Tuesday, July 18, 2023

The 11 Aspects of Good Code

Lessons on code quality start in the first few weeks of learning to program, when a newcomer to the field is taught the basics of variable naming and told why programming languages have comments. They continue in countless blog posts and in every debate on a pull request.

Avoid it or embrace it, code quality training permeates one's entire career.

But it is so easy to lose sight of why.

Says the skeptic: pretty code is a distraction, like the gargoyles on a cathedral or the curlicues of Baroque architecture.

Says the maximalist: Have you ever had a debugging session that demanded you turn over so many stones it felt as if the world was crumbling under you? You don't achieve 10x velocity by becoming 10x faster at debugging, but by writing code that doesn't need debugging at all.

And says the guru: that which is merely a pretty distraction is not code quality.

I've made a professional quest to clarify all the fuzzy terms of software engineering. Students of our course learn definitions of code knowledge and coupling in the first lessons. One of my research papers gives a rigorous definition of dependence.

Today, I carry this quest to its apotheosis: what is quality code?

To this question, there can be no short exhaustive answer. Asking “what is good code” is a lot like asking “how do chemicals work?” It's the subject of an entire field.

But we can more easily ask what is the purpose of chasing code quality, even if achieving it is a craft worthy of a lifetime of study. To recognize quality code, we begin by asking: what are the external and internal properties quality code should have?

External Properties

Good code is done code

We begin with the cliché. All discussions of quality are grounded in the ultimate purpose of the object being designed. The purpose of the vast majority of code is to be executed as software which accomplishes some goal, be it entertaining people, helping them with their taxes, shuttling data, or testing other code. There is also a minority of code built for other purposes: experiments to see if something is possible, examples to explain a library or algorithm, and code that does tricks such as printing its own source.

For all of these, code that fails to achieve its purpose cannot have extrinsic quality any more than an abandoned construction site can be a useful building.

But this does not justify single-mindedness in getting a program to work. There are also non-functional requirements such as performance. Software cannot be quality if sluggishness sends users back to pen-and-paper.

And though an entire business may be dedicated to helping a software product fulfill its purpose, that does not subordinate all other functions to “code quality.” Whether code is quality cannot depend on factors that lie entirely outside the realm of engineering. The failure to market a software package does not make it bad, and a top-down directive requiring people to use it does not make it good.

And so good code is done code.

But we cannot stop there. That is not the end of the story. It is just the beginning.

For if you say “We got it done and delivering value to the customer,” that is not an excuse to your boss when you explain why adding a feature to wish users “Happy birthday” will take several years. And it is not an excuse to yourself when you spend 4 days debugging an issue that turned out to be a typo. Done code is not good code.

Good code is understandable

By one definition, an engineer is someone who understands a system at a deep level.

And, it follows that, for code to have good engineering, it must be understandable.

And sometimes, such as for teaching code, this is its entire purpose.

So you want code to be understandable. But understandable to whom?

To yourself and the people who need to read it.

Or more specifically: to those people at the time they need to read it.

The foolish engineer is offered a new skill, one that will shrink a segment of his code by a factor of 10. “No-one else will understand this” he says as he refuses to learn it. He thus reveals a low expectation of himself masquerading as a low expectation of others. He has hidden within his comfort zone of skill.

The arrogant engineer has a skill and knows it will be effective. “Anyone who cares about this code should be able to learn the technique I used.” If there are to be suitably ambitious readers, the choice turns out correct; but if the audience is one whose concerns lie elsewhere, then it does not. But an uninformed decision cannot be a good one. He has hidden within his comfort zone of empathy.

Either can improve by breaking out of their comfort zone, learning which walls to climb and how, and aiding the rest through construction and placement of ladders.

But the arch-engineer of engineers breaks the comfort zone itself. They are not concerned with climbing nor ladders, for those who follow shall suddenly find themselves atop mountains.

Good code is evolvable

Software is not a point in time but a system.

Precious few programs stand like museum pieces encased in glass, existing only for their own sake, or illustrating a piece of the frozen past. The rest are connected to other programs, to platforms, to growing businesses and rotating customers. They are connected to a changing world.

And now, as we spend our lives glued to screens that came from robotic factories and arrived via satellite-controlled ships, software is the changing world.

You cannot change the world without changing its software. Every software engineer carries the professional burden of building software that is easy to change.

And specifically, to change from one desirable state to another.

We must avoid creating rigid code that is difficult to change at all.

And we must also prevent brittle code that can all-too-easily be changed to something broken.

Good code is easy to extend and difficult to break. The power of a design lies not in what it can do, but rather what it can't do.

But why prepare for a future that may never come? It is impossible to predict the exact ways code will change.

Yet it is often easy to predict that code will change. Or even where. And that's all that's needed to create evolvable code.

Yes, it is a folly to design assuming certain changes will need to be made as life takes a certain path. But it is a greater folly to design as if no changes will occur at all.

Internal Properties

It is a lofty goal to say that a program must be correct, understandable, and evolvable.

It is an achievable goal to say that a single function should pass its tests, have few branches, and use abstract types.

And yet the summation of the latter yields the former. Extrinsic quality comes from intrinsic quality. These properties are presented below:

Good code can be understood modularly

Programs are composed of files. Files are composed of declarations. Declarations are composed of lines.

That is to say, programs are built out of pieces.

And every single piece has its purpose.

Every time a line is executed, a change occurs, and there are many true statements that can be said about each change. Most such statements are of no consequence, while some are crucial to the ensuing lines achieving their purpose.

But in some programs, there is a third category of statements. There are facts about the program state that become true on some line, and then are of no consequence until some distant line requires them to be true. Whereas most lines are of concern only to their neighbors, these two lines have grasped hands through a wormhole, their fates entangled. A change in one place can cause breakage on the other side of the universe.

In good code, the purpose of every line can be stated simply. Each line can be understood in isolation. For each, one can reason: if some simple fact about the state of the program is true, then, after running this line, some other simple fact will be true. The assumptions and guarantees of each line click together like Legos, forming simple and correct functions, which in turn click together into simple modules and simple programs.

In bad code, you read a function, ask whether it works, and then read a dozen more in order to have an answer. Changes must be made as tenderly as one playing Jenga, lest the tower collapse.

Good code works by design. Bad code by coincidence.

Good code makes it easy to recover the intent of the programmer

A programmer dreams a new entity. Her mind gradually turns dream into mechanism, mechanism into code, and the dreamed entity is given life.

A new programmer walks in and sees only code. But in his mind, as he reads and understands, the patterns emerge. In his mind, code shapes itself into mechanism, and mechanism shapes itself into dream. Only then can he work. For in truth, a modification to the code is a modification to the dream.

Much of a programmer's work is in recovering information that was already present in the mind of the creator. It is thus the creator's job to make this as simple as possible.

But to do so is a constant struggle.

Every naming decision is a quest to find the word that conjures in the reader's mind the true purpose of the named while warding off misconceptions.

Every function, a quest to carve behavior into something meaningful.

Every module, a quest to create new words that give new powers to the wielder.

Through each such step, we climb towards the ideal of making the program written not in the language of the machine, but in the language of the dream.

They say that for those who have reached the peak, they can simply dream changes to the program and it is instantly so. But great powers are had even by those who only make it partway.

Good code expresses intent in a single place

But it is not enough for it to be easy to go from code to design. It must also be easy to change the design to new code.

The shaper of atoms walks into a room under construction, wide open and brightly lit. “No!” he cries. “I want it to be dark and intimate.” Before him a vast itinerary of work is created, as that one directive demands thousands of strokes of the paintbrush and new choices for every object so contained.

The shaper of bits walks into a website, and says she wants a dark mode. In good code, she speaks the new colors that comprise a dark mode, and it is so. In great code, she merely speaks “dark mode” and the colors are inferred.

Yet all too often such a change, though simple, requires tweaks in thousands of locations, like a thousand well-coordinated strokes of the brush.

The bits should be easier to change than the atoms, for they live inside the machine.

Yet they can be harder, for there are so many more of them.

Good code is robust

If every line serves a purpose, then every line must be correct.

That means that every line is a new opportunity for a mistake to slip in unnoticed.

And how easy it is to make a mistake is something under the control of the software designer.

Some codebases are so treacherous that working in them is like a tightrope walk across the Grand Canyon. There are functions which require consulting a tome to invoke correctly. Writing to a data structure can produce nonsense. Reading from a data structure may produce only a partial story.

Other codebases are more like an elevator ride, to the point where not even deliberate effort can produce an accident. In such code, APIs have guardrails, where any misuse is either disallowed or can only be accomplished by spray-painting on a red flag. Try as you might, no write to a data structure can produce nonsense. If it compiles, it probably works.

As Tony Hoare says, one can write code so simple there are obviously no bugs, or so complex that there are no obvious bugs.

If you must think as hard as you can to check that a program works, it probably doesn't.

But in good code, you barely need think at all.

Good code hides secrets

Software is not a point in time but a system.

And it is not one system, but many interacting ones.

And each is constantly morphing.

But if it looks and acts the same on the outside, no-one will ever know.

It does not matter to the driver when a car's engineer changes its wiring. Unless, that is, the manual had told her in great detail what to expect from its electrical system and she had come to depend on it. The one who learns the car's battery can charge 5 cell phones for exactly 433 minutes before dying is the one able to achieve maximum performance. But, in a changing world, the one who uses this forbidden knowledge tiptoes close to ruin.

Subsystems are joined when their creator's minds are joined, in conversations that should not occur sharing details that should not be shared. Or when the single master fails to erect a firewall in his own mind.

The hotshot boasts about knowing everything. She creates software that can only be worked on by her fellow all-knowing.

The master's virtue is knowing nothing. And that's enough to maintain his software.

Good code isolates assumptions

Minimizing use of knowledge is the path to evolvability. Secrets are but the extreme, known only to their owners. Every use of knowledge ties the program to the World That Was, hindering the creation of the World That Could Be.

For every datum, there are the components that create it, the components that use it, and those in between that merely deliver it. Do those components pass the datum along like a sealed package? Or are those couriers prying into its contents?

A value is passed from one end of the program to another. Every function on the way that calls the value an “int” is another barrier to making it a float. And every function on the way that calls it anything is another barrier to turning this value into two numbers.

The physical world is full of irreversible changes. Build a building and the town shapes around it; burn it back down and forever shall the wind be tainted with its ashes. But when it comes to reshaping the world of bits, the only thing in a programmer's way is himself.

Good code is open

Programs deal with a domain, and both program and domain can be sliced countlessly many ways into sets. Sets of options! Sets of fields! Sets of formats! Sets of formats of fields which represent options!

And as programs and domains change, such sets grow and shrink. When good code deals with such a set, it is to the extent possible agnostic to the set's size.

The simpleton sees an entity with two possible values, and builds the program using a boolean. The next day, the possibilities have grown to three. A rewrite is required.

There is a set of three kinds of entities, and a program is written that can work with each of them. Then comes the day where one is deprecated. If the program was open in this set, then the relevant code is already in a box that can be discarded. If the program was closed, then branches all throughout have become skeletons demanding burial.

The open-minded person accepts new things. So does the open program.

Good code uses a programmer's full wisdom

The journeyman programmer finds a list of 10 principles for good code. She studies them one by one, and after years of toil attains mastery. Before her the baffling complexity of programs stood as stalagmites of wax; before her gaze, it has now melted down and separated into buckets.

The path there is one of toil. Every place where intuition says the code could be simpler, she seeks how. Every issue that was hard to debug, she searches for how it could have been prevented.

And then she declares “that is all.” Her apparent mastery has brought her respect, and her skill cleaves problems that foil others. She accepts her place at the top and rests.

But the one with the potential to become a grandmaster does not rest. They notice the dregs of wax that fall outside the buckets and see in them opportunities to find new explanations. As they search ever deeper, the buckets dissolve and reveal the interconnected whole. As with the programmer learning a codebase, they have stepped into the dream behind the concepts, and are now ready to dream themselves.

This list came from years of observation, reflection, study, teaching, and refinement. It is yours now to study, criticize, preach, ridicule, and extend.


On modularity: The 3 Levels of Logic

On intent: My Favorite Principle for Code Quality

On Robustness: State of emergency! The four ways your state might be wrong

On Secrets: David Parnas, “The Secret History of Information Hiding”

I mostly lack public resources on openness and the sequestering of assumptions, although refunctionalization is one technique for achieving it.

But for all of these, the best way to learn them is through deliberate practice.

And for that, we have the Advanced Software Design Course.

Thank you to Nils Eriksson, Jun Hong “Nemo” Ya”, Emmanuel Genard, Paul Weidinger, and Yongming Han for comments on earlier drafts of this essay.

Thursday, March 30, 2023

'Modules Matter Most' for the Masses

The title of this post should be in double quotes, but it seems parts of Google have not learned about abstracting sanitization. Screenshots of bug here.

As systems grow, they become more complex, overwhelming the mind. And yet we humans have a trick to build objects of far greater power than could be understood by any one person. That trick is modularity: the ability to create something out of parts which exist independently yet combine into a far greater whole.

A funny thing happens when a system is broken into modules. A new object appears, something separate from and longer lived than the system itself: the interface.

The most powerful explanation I've found of the primacy of the interface is Bob Harper's Modules Matter Most. According to Harper, the world has successfully taught that good interfaces have multiple implementations, but so few know the other half of the story. The essence of modularity is a simple theory, with the formalism fitting in just two lines. Yet it's obscured in most languages; only two are suitable for teaching it. And if you're thinking it's some academic language like Haskell, then you're half wrong.

And there were zero suitable blog posts for explaining that. Because:

Rumor has it that Professor Bob Harper's second career choice would have been a priest. Judging from his lectures: what a priest he would be! He rains fire from the pulpit about how the idea of a variable is mankind's greatest invention, how homotopy type theory is “the mother theory” and mathematics is a subfield of computer science, and how a single type rule contains the secrets of software engineering, all while he preaches the gospel of the “Holy Trinity” of type theory, proof theory, and category theory.

Alas, in contrast to his lectures, Harper's writings lack the incendiary power to catch on in those not already stuffed with the proper kindling. Much as I personally enjoy Modules Matter Most, many students find it impenetrable thanks to its casual references to deep ideas in type theory, and we field questions from students confused about it several times per year.

For the first time, I shall now explain the ideas in Harper's article for a broader audience.

Modularity means module types

We are told that “modules should depend on the interfaces of other modules, not their implementations.” There's a lot that can be said about what it means to depend on something. But there are some subtleties as well to what goes into an interface.

But wait! Lots of languages have interfaces built in. Here's an interface for a stack in TypeScript.

interface Stack {
   isEmpty : () => boolean;
   push    : (x: number) => Stack;
   pop     : () => [number, Stack];

It is true that TypeScript has a keyword “interface” and an associated language construct. As do Java, Objective-C, and many other languages. But these are far more limited and restrictive than the general concept in software engineering. Confuse them at your peril.

The TypeScript code may look straightforward, but it actually falls far short of being an interface to an arbitrary stack implementation. I'll explain how in full later. But for now, I'll point out a more basic failing: This is the interface for a class, not a module. I will be sticking with stacks as my running example, but we want something that can generalize to, say, the Node Filesystem module. This module contains multiple distinct classes, such as file handles, directories, and output streams. It also contains top-level functions with special access to these classes; the classes also have special access to each other. They are interrelated in a way where giving an interface to just one class at a time will be missing something. There is a module boundary, in that code inside the module is given capabilities denied to code outside.

Now, go try to write a TypeScript interface that lets you share information with other classes in the same module but not in other modules. You just can't.1

We need to talk about the interface of the file system module as a whole. And the interface keyword just won't cut it.

Lots of languages, including TypeScript, have a construct they call “modules.” Yet when we look at the documentation, it only speaks about imports and scoping. There is nothing about modularity, information hiding, or any of the other properties we've learned to associate with the word “module.” They are empty shells containing a bundle of definitions. In the right hands, they can bundle together cohesive functionality with shared secrets. But I prefer to call them what they are: namespaces.

Real Modules

But there is another construct that better captures what a module is: a bundle of functions and types whose internals may be hidden from the outside world. And that construct is, appropriately, called “modules.” Proper modules. But sadly, they can be found only in the ML family of languages, particularly OCaml and SML. Here's the interface for a stack module in OCaml:

module type Stack = sig
  (* Types *)
  type stack
  (* Functions *)
  val mkEmpty : () -> stack
  val isEmpty : stack -> bool

  val push : int * stack -> stack
  val pop  : stack -> int * stack

This does resemble that TypeScript interface. And it should, as the bulk is defining the same set of functions (plus one missing from any object interface). But there are some extra features it has that the object interface does not.

First, notice how that opaque stack type is declared separately from its operations. We could easily add more types, and their operations would have full access to each other. For the file system, I could go

module type FileSystem = sig
  type filehandle
  type dir
  type fs_watcher


and the internal representations of file handles, directories, and watchers would be freely accessible within any module implementing the FileSystem type, but hidden without. The abstraction barrier is placed around the module, not around each type.

Another feature is the ability to write programs not tied to any single implementation of the Stack interface. To do this, we write a function which takes a stack implementation as input, and returns as output a module containing the rest of the program.

module MyProgram (S : Stack) = struct
  (* Definitions go here *)

Thanks to these module-valued functions2, OCaml programmers can write modules which are not bound to any particular implementation of their dependencies. The general term for this is “dynamic architecture,” and most languages support it poorly. They rely on #ifdef's to do this — or worse, the build system. Dependency injection is a common antidote, but only part of the answer. You can parameterize the Stack implementations in Java, but the easy way is wrong, and the correct way is complicated. As I'll explain later, to do it properly, you must first stuff everything about a module, including multiple types, into a single injected object.

Realer Modules

But, even in a language with real modules, there is still something missing.

Let's do away with these special words “push” and “pop,” and just call the operations “add” and “remove.” Now we can create a Queue interface that also has “add” and “remove.” How would it compare to the Stack interface? It would be identical! And yet I expect that people who want the two interfaces merged are about as common as people who demand to be served first because they entered the line last.

This shows that, when we think about interfaces, we are not just thinking about whether a module contains functions with certain names and certain shapes of inputs and outputs. We are also thinking about the specification — the behaviors we expect these functions to have. Perhaps there is a comment next to pop() saying that it's the inverse of push(). This fact can be considered part of the module type every bit as much as the types of each constituent function because knowing this is necessary to actually use the module. So let's make it actually part of the type.

There are a few languages with type systems powerful enough to express the relation between push() and pop(). Coq, Agda, Idris. I can show you what it looks like in one of these, but there are already enough uncommon notations in this post. So here's what it looks like in a hypothetical extension of OCaml. In this example, I've extended the Stack module type with two properties: that isEmpty returns true on a freshly-created empty list, and that pop is the inverse of push. I can add more, but this is enough for illustration purposes.

module type Stack = sig
  (* Types *)
  type stack

  (* Functions *)
  val mkEmpty : () -> stack
  val isEmpty : stack -> bool
  val push    : int * stack -> stack
  val pop     : stack -> int * stack

  (* Properties *)
  property mkEmptyIsEmpty : isEmpty(mkEmpty ()) = true
  property popPush        : forall s n, pop (push (n, s)) = (n, s)

You may have noticed that I used the same colon (:) syntax when stating the properties that I used when giving the types of functions. That is not a coincidence. In a sense, I can think of the isEmpty value as a proof that it's possible to take a stack and produce a boolean.3 Similarly, mkEmptyIsEmpty is a value whose type is a proof that: isEmpty(mkEmpty ()) = true. In a language that can do this, the expected properties of each function in a module are physically part of the module's type. In other languages, they are still morally part of its type. We put these expectations into our documentation as a simulacrum of languages that can do this first-class.

When we talk about only relying on the signature or interface of a module, it means to only rely on representations and properties which are exposed in the module type. In this post, I am primarily using the term “module type” to avoid preconceived notions about what “interface” and “signature” mean. But, in correct usage, the type, signature, and interface of a function or module are the same thing. 4 And types are the end-all be-all of interactions between program components. For, in a sufficiently powerful type system, every useful fact that can be stated about a function or module can be stated as part of its type.

And now we are ready to explain the type-theoretic account of modularity.

Modularity in One Rule

Type theory is the science of taking all the beauty of programming and compressing it down to a few opaque lines full of Greek letters. Alternatively, type theory is the art of discovering how subtle aspects of mathy definitions give rise to strange new universes of programming models.

I've written before how a single rule explains why mutable and immutable data models act differently. And Bob Harper claims that modularity is characterized by a single rule. Rather than relying on me interpreting it for you, I think it's worth diving in and facing this source of truth directly. After developing the idea that everything you might want to say about a program can be reduced to a type, modularity becomes astonishingly simple.

So here's the rule. Rules like this are read: if each part on the top is true, then each part on the bottom is true. Each of these parts is called a “judgment.” I'll break it down for you.

And now, an explanation of each judgment in the rule:

This part means that the module M has module type (i.e.: interface) A. This judgment is read “gamma entails M has type A.” Here, Γ is called the context, and contains everything else that has been declared in scope. It doesn't change much, so we usually don't need to talk about it. Also, let me remind you that a sufficiently powerful type system lets you stuff arbitrary correctness requirements in the type, so we can read “M has type A” as saying “M is correct.”

This means that module N has module type B. Within the definition of N, it may use an unspecified module x that has module type (interface A). This must be true for any module that satisfies that interface.

This part introduces some new notation: “[M/x]N” is read “M substituted for x in N.” Taken together, this judgement means: We take N from above, which used some unknown module x that has module type A. Now, we replace x with M, the actual module definition that has module type A. After doing that, this new module, the composition of M and N, still satisfies that same type B, and is hence correct.

Now, bringing it all together, this rule means: We have a module N which uses some module M, but only through its type (interface) A. M can be replaced by any other module with the same type, and N will continue to work. That's modularity.

Many to many

The relation between module types and modules — more generally, between interfaces and implementations — is many-to-many. One direction of this is pretty common knowledge: an interface can have multiple implementations. But the other direction is equally important and less appreciated: implementations satisfy multiple interfaces. 5 This is a succinct statement of a special case of the lesson in my post The Design of Software is a Thing Apart: it is not possible to look at an implementation and recover its intended interface because there are multiple possible such interfaces.

This is such a general idea that it applies in the physical world too. If you see someone baking cookies with coconut oil instead of butter and using dairy-free chocolate chips (implementation), it's not possible to determine what dietary restrictions (interface) they are trying to satisfy: vegan, dairy allergy, or unrestricted but just prefer the taste. And going further: vegans can arrive at the exact same set of dietary restrictions for completely different reasons (design intentions), be it out of concern for animals, religious practice, or belief in health or environmental benefits.

Now back to software. To see that the stack interface satisfies multiple design intentions, just think of two different applications of stacks; anyone working on either could reinvent the idea of a stack if they had never encountered it before. Now, to show you an implementation satisfying multiple interfaces, I'll need to give you a concrete implementation of a stack. I am going to give the full definition in OCaml, but you do not need to understand it. If the syntax is unfamiliar to you, you can take my word for it that this is the normal implementation of a stack based on linked lists.

module ListStack : Stack = struct
  (* Types *)
  type stack = int list

  (* Functions *)
  let mkEmpty () = []

  let isEmpty s = match s with
    | [] -> true
    | _ -> false

  let push (x, s) = x :: s

  let pop s = match s with
    | [] -> failwith "Cannot pop from an empty stack"
    | x :: xs -> (x, xs)

  let size = List.length
  let iter = List.iter


This implementation contains way more information than the interface. Just as you could write pages describing the dust specks on your doorknob, you can write endless facts that are true of this implementation. You can explain that the error message when popping from an empty stack starts with “C” and ends with “ck,” for instance. And I've added on a few extra functions, size and iter, to make it easier to actually write useful facts about this implementation not guaranteed by the Stack module type, even the version augmented with properties. Here are six other module types the ListStack implementation also satisfies:

  • An extension of the Stack module type that also guarantees all functions run in constant time.
  • An Iterable module type of modules with an iter function.
  • A Sizeable module type of modules with a size function.
  • A module type giving everything in the Stack module type along with the presence of the iter and size functions, but no additional properties about the new functions
  • A module type with a property giving the relation between the size, push, and pop functions
  • A module type containing everything in the Stack module type along with additional properties giving the relations between push, pop, size, and iter.

We see there are multiple ways to find a new module type. We can focus on functions of the implementation not present in Stack. We can omit some of the functions. We can add extra facts about the functions in Stack. And, when we add new functions, we can add properties about how they relate to the old. No realistic caller of ListStack will rely on everything that can be said about it. And that means that each caller would still function if ListStack was exported through the minimal interface containing exactly what that caller relies on.

And we can actually do this, for, not only does ListStack satisfy all these module types, but it does so with no prior arrangement. If I want to export ListStack through a CheckEmpty module type that only contains the isEmpty function, I do not need to go back and make sure the Stack module type is declared as a subtype of CheckEmpty, as I would for a Java class. I just write ListStack : CheckEmpty and it works.6

The Perils of Java (and Haskell) Schools

Harper names OCaml and SML's support for modules as a reason to have one of them supplant Java in CMU's undergraduate curriculum. When I first read this piece as a CMU undergrad over 10 years ago, I thought Harper was just making excuses to foist his favorite language onto everyone else. Now, I think he's right.

Java is just bad at giving multiple interfaces to the same implementation. They've tried teaching this kind of modularity in Java for years and failed. I now face this myself: when I ask students to give examples of a single implementation with multiple interfaces, around half the time I get an example of a single interface with multiple implementations.

But Java has objects, and objects at least make it easy to give multiple implementations to the same interface. Right?

In the hands of a skilled user, yes. But even demonstrating that is not easy.

Let's try translating the Stack module type into a Java interface. Here's the obvious way:

interface Stack {
  Stack push(int x);
  int pop();
  boolean isEmpty();

This may look like a straightforward translation, but it's actually very wrong. The pop() method doesn't return a new stack, so this implementation has to be imperative. There is no mkEmpty() function, meaning there must still be a constructor invocation new ListStack() somewhere in the code, tying it to a specific stack implementation. And push() actually loses some important information: there's nothing saying that the Stack it returns must be the same implementation. The version I gave in TypeScript earlier is slightly better (thanks to TS having tuples), but still suffers from most of these problems.

Here's the correct translation. We need to use the Factory pattern to parameterize over implementations of mkEmpty(), a type parameter to tie the factories to their corresponding Stack's, and something called F-bounded polymorphism to make sure the Stack's going out and in have the same implementation.

interface StackFactory<S extends Stack<S>> {
  S mkEmpty();

interface Stack<S extends Stack<S>> {
  S push(int x);
  Pair<S, Integer> pop();
  boolean isEmpty();

So now we have some Java idioms — the Factory pattern and this funky S extends Stack<S> business — getting in the way of just trying to explain how a program can use a stack without knowing which implementation it is. If you had to give this lesson to sophomores, you might be tempted to do it in OCaml to avoid this nonsense, even if you don't know OCaml.

Maybe we can teach it in Haskell, a language with far more industrial use than either OCaml or SML. Haskell's typeclasses do make it straightforward to write an interface for many different stack implementations.

class Stack s where
  mkEmpty :: () -> s
  push    :: (Int, s) -> s
  pop     :: s -> (Int, s)
  isEmpty :: s -> Bool

And yet, even this invites confusion about the relationship between implementation and interfaces. Typeclasses may be cool, but they come with a dark side. Consider the Read typeclass:

class Read a where
  read :: String -> a

With Read, the compiler will automatically pick the right read function based on how the result is used. So if you type read str + x, where x is an Int, then it will run an integer-parsing routine on str. But if you type getEmployeeName (read str), then it will pick and run an Employee-parsing routine. The downside though is that, without doing some super-advanced stuff, there can be only one such read function for each type. If you want to have two different ways of serializing Employee's, then, sorry! Go back to having separate readEmployeeFormat1 and readEmployeeFormat2 functions like a pleb.

Back to stacks. Say you had a kind of list data structure that can add and remove items from the end just as easily as it could from the beginning. There are two obvious ways to implement a stack using it: by adding and removing things from the beginning, or by adding and removing them from the end. 7

We'd like to explain that there can be multiple ways for this double-ended list to implement the stack interface. The best demonstration would be showing both ways in a single program, but we can't do that. Students are at risk of walking away thinking that “implementation” just means “representation.” For the language cannot express the concept being taught.

So when Harper says that the only languages they could use to teach these concepts to undergrads were SML and OCaml, he's not just being a snooty type theory professor. He's being a snooty type theory professor with a point.

Not Just for ML Programmers

Understanding real modules is worthwhile even if you never intend to work in a language that has them. Modules are fundamental, and so much of what is done in other languages, from build configurations to dependency injection to the Adapter pattern, are in large part an attempted encoding of things easily expressed with modules. To write namespaces in other languages worthy of being called “modules,” you need to understand what representational secrets are shared internally and hidden externally. To properly parameterize a Java program over which stack implementation it uses, you need to do dependency injection on the StackFactory type defined above, something which is much easier to do if you first think about how to do it with modules and then translate.

Modules have so many little pieces which are important in programming and yet which few programmers have encountered except in obfuscated form. I naively thought when I started writing this that I could do it in under 1000 words, but I'm rapidly approaching 3x the length of the original. Harper will throw out a sentence like “It is absolutely essential that a given module satisfy many distinct types , without prior arrangement” and I'll take a page to explain what that means. It's a fundamental idea, and it's trivial in OCaml or SML. It's a tragedy that, to most, it's so alien.

After understanding this post, I encourage you to have a read of the original. I believe I've captured everything important, but Harper has his own unique fire, and he's mixed in some tasty rants about the “machine vs. language” models of code and teaching parallel programming as the default.

Understanding modules fits into a larger mission. Programming shapes the world. But, as programmers, programming languages shape our lives. I'll let Harper himself finish. 

Dijkstra used to say “beauty is our business”, to which I would add that life is too short, and bright minds too precious, to waste on ugly things.  And if you take this point of view seriously, the best and the brightest are drawn to, rather than repulsed by, the field.  [...] students absolutely do love to program and do love the beauty of code, provided that the code you ask them to write is, in fact, beautiful.  There is little more dreary than the corporate bureaucracy of OOP, and few things more lovely than the mathematical elegance of FP.

1 Java has a seldom-used “package-private” visibility specifier which fits the bill a little bit, but stops working once you get a folder of 100 classes and you decide to move a few into a subdirectory.

2 Another name for module-valued functions is “functors.” This is really confusing because “functor” has a grand total of four meanings in programming, each completely different from the others. See Wikipedia's disambiguation page.

3 If you've heard of the Curry-Howard “proofs-as-programs'” correspondence, this is it. Every program's type encodes a theorem, but usually they're not interesting theorems.

4 I actually initially wrote this post using SML, but changed it to OCaml purely because it uses the keyword “module type” instead of “signature.” I want to emphasize: the signature of a module is its type.

5 A salient example: In Objective-C, it’s common to pass an object as its own listener or observer.

6 This would also work in TypeScript, thanks to its structural subtyping.

7 This may be familiar to JavaScript programmers, where there are two ways to treat an array as a stack: using shift()/unshift() and using push()/pop().

Thanks to Alex Rozenshteyn, Emmanuel Genard, Tikhon Jelvis, and William Gottschalk for comments on earlier drafts of this post.

Appendix and FAQ

Nils Eriksson, who previously shocked the world by getting TypeScript's type-inference to play tic-tac-toe, has taken up the challenge of building secret-sharing in TypeScript. It's unusual for TyypeScript, but it has me thinking that maybe this should become a well-known pattern.

Q: You seem to undersell typeclasses. Typeclasses get you 90% of the way to modules, and Haskell's newtype covers the other 10%. Meanwhile, Modules and module-valued functions (functors) may have great theoretical advantages, but they have way too much boilerplate to take advantage.


This is a "Haskell inside baseball" answer and likely a difficult read for anyone not a serious Haskeller. Original context: here.

Yeah, this post probably does sound like it was written by an ML afficionado. It was actually written by a hardcore Haskeller of over 10 years who was even on the PC of the Haskell Symposium last year, although I was explaining a blog post by an SML maximalist.

But actually, the lack of ML-style modules has been at the top of my Haskell wishlist for about as long as I’ve been using it, next to the ability to have finer-grained control of typeclass instantiation at use-sites. I was pretty disappointed by Backpack for that reason; they missed their chance to fix this problem, and Backpack is still little used a decade later.

A big difference between typeclasses and modules is that typeclass instances are chosen per function and need to be specified. Module implementations can be chosen at the top level.

So here’s an example of where I want modules: https://github.com/jkoppel/ecta/blob/main/src/Data/ECTA/Internal/ECTA/Operations.hs#L73. This for my implementation of equality-constrained tree automata (ECTAs), from my ICFP ’22 paper Searching Entangled Program Spaces.

I have two implementations of the memoization library, and expect at some point to add a third based on concurrent hash tables. I switch between them by commenting out an import statement at the top. There is nothing in the type system that forces these two modules to have the same interface. If I wanted better support for switching between them, I would add a build flag and #ifdef’s.

How would I do this with typeclasses? The normal way is to add a typeclass constraint to every single function that needs it. I would go further and add said parameter to many that don’t, because it’s an implementation detail whether a given operation is memoized, and interfaces should be stable. Now my choices are to either wrap each ECTA in an ECTAUsingSingleThreadedMemoization or ECTAUsingConcurrentMemoization newtype, to add a new proxy parameter to every function, or to go full on AllowAmbiguousTypes and force every caller to give a type parameter specifying the memoization library used. Also, did I mention that memoization libraries are based on global state (either “not very Haskelly” or “extremely Haskelly,” depending on your perspective), and that if you supply different instances at different call-sites, the code will function fine, but have a silent performance bug. This puts a burden on the caller to maintain consistency in which the easiest solution is for the caller to….use a module system! The final option is to make the ECTA data structure itself have a phantom type parameter used to select the memoization implemenatiton. E.g.: instead of passing around an ECTA, you always pass around an ECTA Concurrent or an ECTA SingleThreaded.

In any of these, you are exposing implementation details to callers, or even to transitive callers. They are based on universally-quantified types, when I need existentially-quantified types, synonymous with information-hiding.

How to get existentials? I can go object-oriented. In Haskell, that can look like stuffing a typeclass instance inside a GADT. But that doesn’t work for binary functions; I wouldn’t be able to guarantee that, when intersecting two ECTAs, they use the same memoization library (and they would thus use different global state when they need to use the same).

But, yeah, I totally agree about the poor ergonomics of actually writing all your code as a module-valued function (functor) over all its dependencies. I think this could be fixed rather easily using a new kind of import or open statement, but the MLers haven’t yet.

Tuesday, December 6, 2022

Book Review: The Senior Mindset

Software engineers are continuously improving in ways deep and shallow. The easiest things — surface knowledge like how to use git bisect or how to file an expense report — can be taught by straightforward practice or even by post-it notes. The hardest things — the assertiveness to say no, and the confidence to admit when you don't know — are foundational changes thought to require years of reinforcement.

Apparently though, there's a faster option: read Swizec Teller's The Senior Mindset. The Senior Mindset is a 300-page “raw and honest from the heart” treatise on the thoughts and habits of being a senior engineer. Alternatively, it's a 100-page collection of newsletters expanded with several flocks of embedded Tweets and a galactic amount of empty space.

Teller has come up a lot on my radar lately, and I've started reading his newsletter. I found them off-putting at first, particularly in the way they say my name and “my friend” constantly like someone who got his sales training from The Wolf of Wall Street. But after I got used to the style, they started to leave me smiling.

What reading this book is like

So I knew I'd get some enjoyment from his book, though my expectations were not high for learning much. How'd it turn out? Midway through, I found the one passage that exemplifies the entire book:

When your PM says “Send user a message at 9am”, she doesn't know to also ask:
  • What happens in different timezones? 
  • What if our server is down at 9am? 
  • What if the messaging service is down? 
  • What if the database was corrupted? 
  • What if the cronjob runs at 9:01am? 
  • What if it runs at 8:54am? 
  • What if we have to send 1000 messages and that blows through our rate limit? 
  • What if sending 1 message out of 100 fails? 
  • What if we want to analyze sends later?

If your reaction to this list is to nod along, you'll probably love it. But if you think “Actually, I think [name of PM] could do 5/9 of these” and that thought actually bothers you, then, well, prepared to be bothered.

Teller is prone to bold sweeping proclamations. One thing he likes doing is giving simple prescriptions to problems without acknowledging the factors and tradeoffs. “Old code that isn't shitty was over-engineered the first time around,” he says, but I expect he'd change his tune if asked to build a gas leak sensor. A common pattern comes in him attacking a common oversimplified, wrong belief by presenting a similarly oversimplified, less-wrong belief. He counters the fallacy of engineers always selling great new features, but commits the opposite fallacy of assuming all buyers have the same profile (the Merril-Reid model is popular for this). He counters the thing about CS degrees being useless compared to software engineering classes, but he makes the opposite fallacy of invoking magical thinking about how it's helpful, as opposed to a gears-level model. In each of these cases, I don't know if he actually believes the simplified version, or he's just saying it to sound cool.

But occasional details lead me to believe that he is knowingly oversimplifying. For instance, one third of the way through, in “Tech Debt is a Tool,” he boldly proclaims without qualification that tech debt is a secret weapon that lets teams go faster. But all the examples are not tech debt but rather examples of not implementing unnecessary features, and I'm left wondering if he knows what tech debt is. Hundreds of pages later, in “When do you fix tech debt,” he tells a story of a time when attempted tech debt turned out worse than a payday loan, and I see that he clearly does. Overall, I enjoyed the book more when I stopped thinking of Teller as an advice-giver and instead looked at him as a motivational speaker.

In that regard, Teller is like an anti-Dan-Luu. They both write about being successful inside a larger software organization. But if they were both writers for the same business, Teller would be writing the marketing copy, while Luu would be writing the S-1. For instance, Luu's Big companies v. startups is full of calculations of compensation and specific stories from different companies, while Teller's “Should You Work at a Startup” essay gives a few generic complaints that people might experience — and is shorter than just the footnotes of Luu's post. Teller's “Why engineers are worth so much” explains that engineers can produce products worth a lot of money in short time, while Luu's “Why are programmers well paid” discusses multiple competing explanations based on productivity and hiring markets. Also, Luu's piece is over four times longer than Teller's, and is actually the appendix to a greater article. You can play a game of, for each Teller essay, finding a Dan Luu post discussing the same thing with a deeper analysis. The caveat though is that the longest Dan Luu post is the same length as the entire book.

Did I learn a lot about being a software engineer? No. Did I have fun? Yeah. But hey, it's a plane read. So, really, my only complaint is that it didn't last the whole ride.

Overall status: Not recommended

The Senior Mindset is fun but trite. If you skip it, then you're not missing out. But it's short, so, whatever thing you're not doing by spending the time to read it, you're also not missing out.

I personally have a lot to learn from Swizec. He's been a successful influencer and has a mailing list with 10x more readers than I. In fact, I've already started working down his recommendations for books and videos, and even contacted the editor he recommends. And I think people new to the Valley or industry can learn a lot from him, but it's not clear how much others have to learn from him.

In short, The Senior Mindset is popcorn. It's light, fluffy — and it even has some nutrition.

Quotes from The Senior Mindset:

Congratulations, you just became Lead Engineer on a project. You're nobody's boss and it's your fault when something goes wrong.

As the old proverb from the Balkans says: Budala pamti, pametan piše. The fool remembers, the clever one writes.

Trust is best built by not fucking up.

Friday, September 30, 2022

How an Ancient Philosophy Problem Explains Software Dependence

Update Oct 30, 2022: Professor Daniel Jackson wrote a response to some of the criticism this post experienced on Hacker News.

This post is based on a research paper at Onward! 2020 and loosely follows the talk I gave at that conference.

Have you tried to remove a dependency from a project and been unsure if you have successfully done so?

To “invert a dependency?”

To keep the core of a codebase from depending on something specific?

Or to “depend on interfaces, not implementation?”

If you’re a programmer, you probably have. If you’ve been doing this for a while, there’s a good chance you’ve gotten into a disagreement about whether you’ve successfully done so. Dependence is one of those words where everyone thinks they know what it means (Clean Code uses the word over 100 times without defining it). But as easy as it is to speak broadly about limiting dependencies, uncertainty about what it means leads into uncertainty in actual jobs.

By the end of this post, you will learn an objective definition of dependence for software engineering. This definition encompasses all places where engineers talk about dependence, from package management to performance engineering and even usability, and it’s sufficiently rigorous to be mechanically checkable.

By “mechanical,” I do not mean automated, nor even easy. It’s relatively easy to write a function where deciding whether it can interact with another function requires solving a famous unsolved math problem. But with this new understanding in hand, and some amount of thinking, next time you have to ensure that customer privacy does not depend on a third-party library, you should be able to explain exactly what that means and what must be done.


   A dependence question raised in "Oracle v. Google"

We programmers talk about dependence all the time. Like, “my package has Postgres as a dependency but it doesn't depend on it until it's used, except then it still never depends on it because I'm database-agnostic.” If you ponder that sentence, you may realize that every use of the word “depend” has a different meaning. We know that limiting dependence is a good thing, but without really understanding what it means, it's advice doomed to break down — citrus advice.

A few years ago, Daniel Jackson and I set out to figure out what “dependence” really means in software. I daresay we were extremely successful. We have a general definition that specializes to each of the varieties of dependence hinted at in the previous paragraph, and then some. With it, you can definitively (though not always feasibly) answer whether you've managed to remove a dependence. Trying to apply the definitions raises a lot of subtlety, but we found we can address them by stealing algorithms from a certain subfield of (wait for it) computational philosophy. We first published our findings in a paper at the Onward! 2020 conference, where it was greeted enthusiastically by peer reviewers, becoming the only paper that year accepted without revision.

   Image by DALL-E 2, “complex clock mechanism hidden in mist”

This blog post is loosely based on my previous effort to distill our findings, my Onward! 2020 talk. Loosely. The words here may occasionally match what is said in the video, and the numbers next to the slides illustrations below will usually go up. The Onward! conference is a funny place, full of people who fly around the world to learn about the latest in static analysis and type systems, but then ditch that go hear about pie-in-the-sky ideas that don't work. (Though it's also the home of some of my favorite software-engineering essays ever.) I will make no assumption in this blog post that you've ever had reason to care about “static and dynamic semantics.” On the other hand, you might follow the example better than the original audience if you've been exposed to an “analytics framework” by means other than through the window of an ivory tower.

Anyway, let's go learn about dependence. First: why it's hard?

Depending on Analytics

As a running example, I have an iPhone app. It uses an analytics framework, but it doesn't use it directly. I've interposed an analytics abstraction layer, putting some kind of delegator between them. Now my question is, does the app depend on the analytics framework?

I'm trying to argue that the world didn't previously understand dependence, so I'd like to find some contradictory statements about dependence in well-known books. Problem is, these books mention dependence all the time, but rarely say anything concrete enough to find a contradiction. Still, I found this:

Looking at the quotes in the slide, Martin Fowler seems to think that having this interposed delegator removes the dependence. I believe Robert Martin is saying that if something important depends on my choice of analytics framework, then it should be referred to directly in the program text. But my first reading suggests something in tension: one saying that the analytics abstraction layer removes the dependence and hinting that's a good thing, and the other stating that it's bad. This is a pretty interesting contradiction from Robert Martin Fowler, who are quite close as far as software engineering gurus go. Of course, I'm not sure what they're actually advocating, which is part of the problem with existing terminology.

Branching out a bit, I'll let you make your own judgment on the following quote.

In the old Web site with the background specified separately on each page, all of the Web pages were dependent on each other.
—John Ousterhout, A Philosophy of Software Design

(Ousterhout did remove one misuse of “dependence” in the newer versions after I pointed it out. But I found more. Note that I still consider it a good book, even though I've criticized it a lot.)

To really draw out the confusion and mist around the idea of dependence, we prepared nine dependency puzzles.

The 9 Dependency Puzzles

  1. The Framework Puzzle: We usually think that, if A calls B, then A depends on B.

    But consider a round-robin scheduler: it invokes every task to be scheduled. Does it depend on each of the tasks?

  2. The Shared Format Puzzle: If A writes a file and B reads it, they are clearly coupled because of the shared format. But which depends on which?

    You also see this when a client sends messages to a server; if either end changes, the other will break. Do they both depend on each other?

  3. The Frame1 Puzzle: We like to think that, if A does not depend on B, then a change to B cannot affect A.

    But what if you change B in a way that introduces a new dependence, like modifying a global variable used by A?

  4. The Crash Puzzle: As a more extreme version of the previous: A change to any module can affect a change to any other module, namely by crashing the program. So, by the “A does not depend on B if a change to B cannot affect A” rule, everything depends on everything else.

  5. The Fallback Puzzle: A website loads a resource from a CDN but hits a server if unavailable; an app uses a cheap-but-flaky payment processor when it can but switches to a more established one when it's down; an analytics service tries to connect to multiple servers until it finds one that works; a media player that uses hardware video-decoding when available but software when not — these all have the structure “A uses B but falls back to C.”

    With fallback, the failure of B does not imply the failure of A. Does this mean that A does not depend on B?

  6. The Inversion Puzzle: Dependency inversion is touted for removing dependencies: instead of A statically invoking module B, the program passes A a reference to B at runtime. But A will still fail if B fails. So is there no change in the dependence structure?

  7. The Self-Cycle Puzzle: We usually think of dependence as transitive: A cannot function without B, B cannot function without C, therefore A cannot function without C.

    We also complain about dependency cycles: A depending on B and B depending on A is bad, but it can happen.

    Now, putting these together: What does it mean for A to depend on A?

  8. The Higher-Order Puzzle: Libraries should not depend on application code. But, from one perspective, they clearly do: a hash table implementation will give incorrect output if keys have inconsistent implementations of equals() and hashcode() methods (that is, if equal values have different hashcodes). Do hash tables and other library classes depend on their callers?

  9. The Majority Puzzle: There was a fad a few decades ago called “n-version-programming.” The idea of N-version programming: if 5 people write the same program, then there's no way they'll exhibit any of the same bugs, right? So you can have 5 people implement the same (very detailed) spec for a robot controller, and have the programs vote what to do at each timestep; so long as no 3 people have the same bug, it should always function according to spec, right?

    Of course, Nancy Leveson showed empirically that this doesn't work and the fad died, but it presents an interesting hypothetical: Suppose all 5 controllers implement the spec perfectly, and thence all 5 always vote for the same action. Then no change to any single implementation can alter the robot. Does this mean that the robot does not depend on any of the controllers?

We'll be answering all these puzzles at the end. But, coming up next: understanding why the sentence “Does A depend on B?” can actually stand for one of many, many different questions.

Types of Dependence

Let's revisit the analytics example and ask “does the app depend on the analytics framework?” There are several questions here.

Well, not quite. There's a tradition in the Jewish holiday of Passover that the youngest always chants a Hebrew verse titled “The Four Questions.” The joke is that it's actually one question (“Why is this night different from all other nights?”) with four answers. Likewise, “Does it depend” is just one question, but here's five answers, as seen in the image:

  1. No, I can build them separately.
  2. Yes, a crash in the framework will be a crash in the app.
  3. No, because I can just use a different framework that changed the code and I will still have working analytics.
  4. And no because I'll get the same resul-.....actually, no I won't. It's an open secret that no two analytics frameworks give the exact same numbers. Ideally they'll be close enough for all purposes other than choosing what to say when boasting about number of users. But maybe one will give more statistical accuracy, so, realistically, yes.
  5. Oh my god yes, because any security vulnerabilities in the framework will be security vulnerabilities of the app as well.

But yeah, these are definitely answering different questions.

Dimensions of Dependence

To make a question about dependence concrete, you must specify three parameters: the semantics or execution model in question, the property of interest, and the space of permitted changes.

Dimension 1: Semantics

Let's go back to that agonizing sentence from the intro. “My package has Postgres as a dependency but it doesn't depend on it until it's used, except then it still never depends on it because I'm database-agnostic.” Change a couple words, and it fits the analytics example as well. So I said before that this sentence uses “depend” in three different ways2. It's not just that they're answering different questions. These are actually different models of execution.

  1. When talking about importing a dependency, the “execution” engine of interest is the compiler or build system: is the program even well-formed, or is there a build error? This roughly maps onto the more theoretical term “static semantics” or “type system.”

  2. When saying that actually invoking a package is what gives a dependence, the execution engine in question is an interpreter, or the combination of the compiler + a CPU. The focus is now on whether the code actually runs and contributes to the overall program output. “Dynamic semantics” is a general theoretical term for what actually happens when running.

  3. When saying that a program would work even after switching to a different library, the “execution engine” in question is actually the reasoning used to convince yourself or someone else that it's correct / achieves its goal, based only on the guaranteed properties of the analytics abstraction layer. If you buy that that human reasoning is analogous to automated reasoning, then the theoretical term that captures this is the “program logic,” the proof system used to show the program achieves its intention.

This lists just 3 possible execution models, but there are infinitely many more, a few of which are discussed in the paper. For example, the property of whether text is readable could be phrased in Visual Logic.

We can already start to categorize the answers about the analytics framework accordingly. Inability to build separately refers to the well-formedness / static semantics. When a crash in one is a crash in the other, that's the interpreter / dynamic semantics. The ability to swap it out and still have “working analytics,” that's the correctness argument / program logic.

Of course, the last two answers, about accuracy and security, also sound like correctness arguments. So it's not enough.

Dimension 2: Properties

You might notice from the last answers about whether it's secure and whether it has working analytics: those are actually different properties.

You should never ask, does module A depend on the module B? But rather ask, does some property P of A depend on B?

We can now classify the answers better by looking at the property. It does depend in the well-formedness property, does depend on runtime behavior, does not depend on correctness, does depend on statistical performance, and does depend on security.

But there's a tension here. I'm saying that there's no dependence in whether it works, except that I can hack into some versions and not others and make them not work. So maybe correctness does depend? We'll address that soon.

In the meantime, we're on our way to resolving some of the dependency puzzles puzzles. Back to the Framework Puzzle: Does a round-robin scheduler depend on the individual tasks? It clearly does for many runtime properties. But for the correctness of the scheduler, it shouldn't....

...unless it's a non-preemptive scheduler and a task doesn't terminate and hogs all the scheduler time. So yeah, maybe.

We've now seen two examples where we want to say that there's no dependence on a component, except that component can “break out of the box” and mess up the rest of the system. That's where the third dimension comes in.

Dimension 3: Permitted Changes

A property of dependence we'd like to have: if A does not depend on B, then no change to B can affect A. Actually, this sounds like it would make a nice definition. When Dale Carnegie says “happiness doesn't depend upon who you are or what you have; it depends solely upon what you think,” he seems to be saying that merely changing whether one owns a Lamborghini would not alter whether they have the property of being happy; they must also change their belief about whether they own a Lamborghini. But we've seen this apparently violated in the Crash Puzzle described earlier, which can be instantiated to this scenario quite literally: owning a Lamborghini but believing otherwise might get you in a fatal crash, which is detrimental to happiness. More generally: how can we say “no change to B can affect A” when one can “change B” by replacing it with a program that destroys A.

So we're left with this tautological definition: if A does not depend on B, then no change to B can affect A, unless that change introduces a dependence. It's easy to say “we should just exclude these kinds of changes. Like, Signal should be able to say they don't specifically depend on Twilio to work, as opposed to anything else they could use for sending SMS. But they announced a Twilio-based security incident while I was writing this, so sometimes these kinds of changes really matter.

So, here's an idea: how about sometimes we exclude these kinds of changes, and sometimes we don't?

Or, in other words: every conversation about dependence implicitly carries a reference frame about which kinds of changes of reasonable to consider. When you're asking about whether your app's stability depends on an analytics framework, you should get a different answer based on whether you're interested in whether an API change can cause your code to raise an exception versus whether you're wondering whether someone can exploit a vulnerability to brick your customer's phone.

After the execution model and the property of interest, this “space of reasonable changes” is the third parameter determining a dependence question. Sometimes this is a spec: the mantra “depend on interfaces, not implementations” seems to mean that, if your program uses a function, then any change to that function's implementation within the space of “changes that don't alter the spec” should not change your function's correctness. So you could call the “space of reasonable changes” a spec. But that would be misleading, because if you want to ask “Could the library authors mess me up by removing this guarantee,” you are actually asking about possible changes to the library's already-existent spec. In that case, the space of permitted changes is actually a spec on the spec, and so the best name we came up for this is “super-spec.”

Now we can totally resolve the Framework Puzzle: Forbid tasks that don't terminate, and there's clearly no dependence. And the Crash Puzzle: Forbid changes that make the component crash, and there's no dependence.

But there's still the Fallback Puzzle. Based on what we've said so far, we really have no clue. Intuitively, I'd want to say that my payment succeeding depends on the payment processor that was actually used. But if the only thing that a change to that processor can do is bring it down, then no change to it can stop my payment from going through, because the system would just use the other processor. So now it sounds like, if your system has a fallback component for everything, then it has no dependencies!

More abstractly, we're asking questions like “I used it, but I had another one, so did I really need it?” If you're thinking that this sounds like something philosophers ask, then you're right.

Dependence is Causation

So I foreshadowed this one by suddenly talking about happiness and Lamborghinis: everything I've been talking about can also apply to everyday things in the physical world. In fact, all of the dependency puzzles are isomorphic to a physical causality question.

The Framework Puzzle with the round-robin scheduler and how long tasks take? That's a lot like stacking boxes; the contents of the box don't matter unless they're really heavy. The Crash Puzzle, where everything depends on every other component because they can crash? That's a lot like saying that your ability to read this right now depends on what every single one of your neighbors is doing, which is, apparently, not cutting your electricity. And the Fallback Puzzle? That's a lot like asking if the grass being wet right now is caused by it just having rained when the sprinkler was programmed to come on if it didn't.

More specifically, there are actually two categories of causality. There's “type causality" or “general causality,” which deals with general statements about categories such as “A stray spark causes fire." And then there's “actual causality" or “token causality,” which deals with statements about specific events, such as “A stray spark in Thomas Farriner's bakery on Pudding Lane caused the Great Fire of London on September 2nd, 1666." As dependence deals with questions about specific programs and specific scenarios, actual causation is the one relevant here.

There have been several proposed formal definitions of actual cause, and there is no standard for determining whether a definition is correct—only arguments that it is is useful and produces answers that match intuition. The basic one is the “naive but-for” definition: A causes B if, but for A having been true, B would not be true. This is the definition used by lawyers, and is the reason you can be charged with murder if a cop falls to their death while chasing you, because you “caused” their death. But when applied to the rain and sprinkler, it tells you that the rain cannot be a cause of the wet grass, which is counter to intuition.

But maybe that intuition is because rain and sprinklers are actually pretty different. There are isomorphic questions, such as “Is my apartment cool because I turned on the AC at 6:58 when it was scheduled to turn on at 7:00,” for which the intuitive answer is no. Similarly, my intuition says that my credit card succeeding did depend on which payment processor was actually used. But I can't be so moved to care which of many redundant servers my browser is connected to when writing this. So while much of the literature on actual causation examines the shortcomings of the “naive” but-for definition and focuses on examples like the sprinkler, we've decided that the but-for definition is actually pretty good, and the problems mostly come up when theorists play with models of rain and sprinklers that ignore their differences.

This means that I get to spare you the debate about the definitions of actual causation. If you really want it, you can have a look at our paper for a brief intro. For more, well, Joseph Halpern has an entire book on this subject.

Anyway, this leads to the finale, a general definition of dependence:

A property P definable in some execution model of a program depends on component A if and only if A is an actual cause of P, with respect to a space of permitted changes to A.

Using the but-for definition of causation, this becomes:

A property P definable in some execution model of a program depends on component A if and only if there is a permissible change to A which alters whether P holds.

Puzzling No More!

And now, final definition in hand, we can go back and definitively answer all of the dependency puzzles. And by aping the causality literature, we get some new concepts for free. For some questions, we get the answer “it's not a dependence, but is part of the dependence,” stealing the concept of “part of a cause.”

  • The Framework Puzzle: Asking about dependence of a specific property resolves the confusion. It is true that the runtime behavior of a round-robin scheduler, particularly timing, does depend on the tasks being scheduled. But its correctness does not.
  • The Shared Format Puzzle: There are actually two resolutions to this puzzle. For the first, we again look to a specific property.

    The serializer/deserializer should satisfy a round-trip property, that serializing a value and then deserializing it results in the original value. This round-trip property clearly depends on both. Through another lens, we can define correctness of each end separately, in terms of an abstract description of the format. Then the correctness of each depends on the shared format, but not on the other end.

    In both cases, the serializer and deserializer do not depend on each other, though they are coupled. 
  • The Frame Puzzle: Super-specs resolve this puzzle! If it is reasonable for B to modify a variable used by A, then A already depends on B, and a programmer would indeed need to read the source code of A to check that it does not interfere with B.

    But if one imposes a super-spec on B forbidding it from modifying such extraneous state — in mundane terms, if your tech lead would yell at you for changing it to do such mutation — then the dependence, and with it the need to read, disappears.
  • The Crash Puzzle: This one's just an extension of the previous. Super-specs again come to the rescue. If you consider crashing changes to any module, then indeed the runtime behavior of every module depends on every other module. But if crashing changes are out of consideration, then this is not so.
  • The Fallback Puzzle: These fallback scenarios are isomorphic to examples from physical causality, such as “If it doesn't rain, the sprinkler will turn on [ergo, the grass-watering procedure falls back to a sprinkler].” And so, concepts from causality dissolve the confusion. There are two resolutions.

    The first: a dependence doesn't have to be a single module! We can say that the only reasonable change to the CDN (i.e.: its superspec) is to knock it offline. Because of fallback, that can't prevent an image to loading, so the image loading indeed does not depend on the CDN. But simultaneously knocking the CDN and the fallback server offline can! So, the dependence is actually on the set {CDN, fallback server}, and the CDN itself is part of a dependence.

    But another resolution is to ask: is it really the same to use B vs. falling back to C? In the case of loading an image from a CDN, perhaps not, and it's intuitive to say that you loading a website just now did not depend on its CDN. But for receiving a payment through SketchyCheapProcessor vs. StodgyOldProcessor, it would be an expensive mistake to treat them as the same!
  • The Inversion Puzzle: Thinking about the lens of multiple tools resolves this one. To the interpreter, even after doing dependency inversion, there is a dependence (i.e.: runtime behavior does depend). But to a compiler or typechecker, the dependence has been removed (i.e.: well-formedness does not depend). 
  • The Self-Cycle Puzzle:

    Insights from the causal-inference literature resolve this one as well. Surprisingly to many, causality is not transitive, and hence dependence is not either. Simple, if facetious example: Seeing an advertisement for pizza causes me to press buttons on my phone. Me pressing buttons on my phone causes your phone to ring. Me seeing an advertisement for pizza does not cause your phone to ring (as it causes me to press the wrong buttons). Discussing exactly when and why causality is not transitive can get pretty technical, and is sensitive to exactly how one translates a statement into formal language. (Indeed, you may have already prepared a valid reason why my pizza example should not count.) If you're interested, I'll refer you to Halpern's “Sufficient Conditions for Causality to Be Transitive.”

    Of course, many kinds of dependence are transitive, such as build dependence. But this does mean that you don't need to bend your brain around self-dependences; just use the definition in this article.
  • The Higher-Order Puzzle: Actually stating the property of interest resolves this one. Every function's correctness is defined with a precondition: if the inputs are valid, then the outputs are as desired. In the case of a hash table, the input includes the keys' equals() and hashcode() functions, and these are only valid if their implementations are consistent. If you build a hash table on keys whose equality is determined by coin-flipping, then getting bad output is not the hash table's fault. So the hash table's correctness does not depend on any specific implementation of equals() and hashcode(), but its runtime behavior sure does.
  • The Majority Puzzle: This is another puzzle ripped straight out of the causality literature, taken from examples on (wait for it) majority vote! The idea of being “part of a dependence” returns to resolve this one. Indeed, the robot's behavior does not depend on any single one of the 5 controllers, but it does depend on each “winning coalition” of voting programs; that is, each subset of size 3.

And now we can go back and un-confuse the Fowler and Martin quotes from the beginning. Fowler is talking only about static/build dependence. Martin Fowler is talking about how inserting your abstract delegate factory removes the build dependence. Robert Martin seems to be saying that if the correctness property depends on which analytics framework you're using, then you shouldn't be using this abstract delegate factory. You should actually have direct reference textually to make the well-formedness depend. Now we can see this piece of advice are actually not in contradiction, that it can still be good to insert this analytics subtraction framework when the choice of analytics framework does not matter, when the correctness property is not affected by the choice. But when you are relying on a single choice, then: not such a good idea.

Coupling vs. dependence

I mentioned coupling briefly above. Where does coupling fit in? I think this story about the connection between causation and dependence also explains coupling: dependence is to causality as coupling is to correlation.

A common view in causality is that the connection between causality and correlation is given by Reichenbach's principle: that if A and B are correlated (i.e.: knowing one gives some information about the other), then either A causes B, B causes A, or there is some common cause C. As an addendum, correlations can also be observed if one limits observations to situations where some common effect of A and B occurs. For example, if you have a bookshelf, you'll likely find that the more useful books tend to be less interesting — not because of some intrinsic trait, but because you probably don't buy books which are neither interesting nor useful. This is Berkson's paradox.

I propose that there is a software analogue to Reichenbach's principle: that if A and B are coupled (which I take to mean: tend to change together), then either A depends on B, B depends on A, or they have some common dependency. And there is a software analogue of Berkson's paradox too: if one is trying to preserve some property that depends on both A and B, then a change to A will be accompanied by a change to B, as in the case of changing a serializer while trying to preserve the round-trip property.


Dependence is something everyone assumes they know, but that understanding breaks down quickly. It may be pretty cool to connect dependence and causality, but what does learning this mean for actual programming? I say there are three benefits, although they can be said to all be part of the same benefit.

The first benefit: it ends debates and clarifies confusing discussions. See our alteration to the Martin Fowler and Robert Martin quotes above.

The second: it paints a target. When I started the Thiel Fellowship in 2012, I had to write a letter from my future self in 2014 explaining my accomplishments. I stated that I would have a tool that could globally chase down every place in the code that depends on some assumption and remove that assumption. It's 2022 and I still don't have one, but at least the problem is defined now, so I know where to begin.

And, of course: knowing what dependencies are helps you remove them. For example, Parnas teaches us to not let general things depend on specific things. Just as there are ways to trick yourself into thinking you've removed an if-statement without really doing so, there are ways to trick yourself into thinking you've removed a dependence — say you only removed a static dependence, but you actually wanted to remove the dependence of some correctness property.

And there's one dependency I especially want to remove: the dependency of the ability to produce quality code on indescribable experience and pattern-matching.

I'm thinking of a story from a recent student of my Advanced Software Design Course, whom I'll call Karl. Karl is the lead software engineer of a startup in Northern Europe. He noticed some design flaws in some code from one of his junior engineers, and started to refactor it, changing 1800 lines across 10 PRs. Hoping to make it a learning experience, Karl walked the engineer through it until Karl thought the engineer understood, so that he could finish it himself. Karl assigned the rest of it to the engineer, and then came back to find...

...the code had been un-refactored into the original form.

What happened is that Karl built understanding of what was wrong with the original code and good about the refactoring via a mechanism whose effectiveness depends on building an intuition from a lot of experience. Although it served Karl well as an engineer, it failed him as a lead; it was not transmittable. That motivated him to come to me.

When a concept teaches you how to identify the best design decision and explain it, it's obviously useful. But I think there's a subtler benefit that comes from understanding your actions, like going from following recipes to being a chef. I hope this understanding helps your subjective experience of programming move from climbing a mountain to enjoying the view at the top.

1 “Frame” is a term used in program analysis to mean “the set of values a code snippet may modify,” and more generally in AI and philosophy to mean “the set of beliefs that may have to be updated in response to an action.” See Wikipedia or Daniel Dennett’s delightful essay.

2 If you've read The Three Levels of Software: The dynamic semantics here correspond to both the runtime and code levels in that blog post, depending on whether you're considering one execution vs. all possible executions. The level of modular reasoning maps perfectly onto the “program logic” model here. The static semantics here was not discussed at all in that blog post, because there's no need for new terminology to discuss build errors.

Thanks to Nils Eriksson for comments on earlier drafts of this blog posts. Thanks to Daniel Jackson for being a great coauthor helping me create the original content, and to the anonymous reviewers at Onward! 2020 for their feedback helping us to refine it.