Saturday, December 14, 2019

My Interview on CoRecursive: Advanced Software Design with Jimmy Koppel


A few months ago, I sat down with Adam Gordon-Bell of CoRecursive to share my thoughts on software design and self-improvement.

It's actually a bit funny how I found him. "Corecursion" is an advanced programming term that to recursive or iterative programs which are structured around the shape of the output, rather than the shape of the input.  It's connected to the theory of data vs. codata, which explains why interactive programs like an HTTP server look very different from batch programs like a compiler. The canonical example of codata is infinite streams, which need very different programming techniques than in-memory lists.

So, I had been preparing to teach a lesson on codata to a group of my advanced students, when I discovered a podcast by that name, which would be an excellent fit for the topics I write about. And indeed it was. Adam was a skilled podcast interviewer, and managed to elicit a lot of information from me on topics ranging from program synthesis to deliberate practice.

Original podcast link is here. I've created a transcript below.

Topics Discussed

  • Why is Designing Software Hard?
  • Software Intentions
  • Building non-fragile software using assumptions
  • Program Synthesis
  • Program Search Space and Abstraction Barriers
  • Logical Level
  • All Possible Changes to Code
  • Code reviews and Stable interfaces
  • Better Refactoring
  • Don't Don't Repeat Yourself
  • Names matter
  • Getting stuck in a design
  • Code quality is not about code
  • Understand the code that is not there
  • Proof level
  • Software Maintenance
  • Cross Language Refactoring
  • 20 Under 20
  • Benjamin Franklin Method of Programming
  • TOPGUN Programming And Deliberate Practice

Transcript


Jimmy:
I'd say that learning to read this logical level, it's kind of like being able to detect pitch in music, in that everything of software design requires seeing it.

Adam:
Hello, this is Adam Gordon Bell. Join me as I learn about building software. This is CoRecursive.

Adam:
That was Jimmy Koppel. He's a super interesting person. He's working on his Ph. D. In the field of program synthesis at MIT. He was previously paid a hundred thousand dollars to drop out of university by Peter Thiel, but he still graduated somehow.

Adam:
The most interesting thing to me is Jimmy's working hard to teach the world how to design better software. And I think he has a really interesting perspective on this. I guess my summary is after spending a lot of time writing programs that write programs, which is what program synthesis is about, he developed some unique insights into what makes software good, what makes it bad, and he spends his time trying to teach people these insights.

Adam:
So Jimmy, thanks for joining me on the podcast.

Jimmy:
Yeah, it's good to be here.

Adam:
So, I feel like you have a unique perspective on software design. I'd like to ask you some questions about that. So why is designing software hard?

Jimmy:
Why is designing software hard? Why is doing anything hard? So rather than jump to that, I'll tell you, there are certain paths in software that are automatic, and there's certain ones that require creativity.

Jimmy:
So, imagine you're writing a system that inputs from one thing and outputs to another. Very often the nuts and bolts of that thing can be done automatically. So, you get the data in some format, and there's only so many things you can do with it, so it's kind of automatic. Then you have the raw data elements, you need to write some format. That's kind of automatic.

Jimmy:
The creativity enters when you have things going on in-between. So, the creative parts of software design are typically in building the intermediate representations. So the creativity happens in building a representation, designing your data abstractions, and usually the in-between parts become automatic.

Adam:
So, the in-between parts are sort of the serialization and de-serialization or-

Jimmy:
Yeah, things that break the data down, and then build it up again. Those are the glue, and they tend to follow fairly automatically from the actual shape of the data.

Adam:
That makes sense. So, do you think that this middle part, like the software design aspect of doing the middle, that that is like a well understood field?

Jimmy:
I think it's a lot more understood than most people realize. So, we still are very far from being able to automate it well. There's a lot of things that we do know how to do, and that I do teach. But part of what it makes it hard is that so much of what goes into that process, and this is something you'll hear from me again and again today, is really about the intention of how the thing is going to be used, which shapes all the ways it can evolve. And that is something which is not directly apparent from the code.

Adam:
Is intention, like the intent behind building this piece of software, is that the intention? So, let's say I work on some software that takes in a Docker image, so, binary snapshot of a Linux environment, and then it outputs a list of security problems that it found.

Jimmy:
So, part of the intention comes from how you define a security problem. Then from your definition of what a security problem, that determines what it's going to look for. So if you intend for there to be some kind of filtering where it's, say, going to not report certain boring kinds of things, or things you're not sure are a problem. And that's going to imply a totally different sets of filters and actions and extra checks that you can put in, versus if it was a purely mechanical process.

Jimmy:
So, the work of software design is in translating this high level goal of reporting security problems, down to low level details, like pointer scan for this, check that — and your high level description of this piece of software. Like it takes some Docker container and reports a security problem. That may never change.

Jimmy:
And then for each component, you might have intermediate level descriptions. And it's possible those intermediate descriptions will change as well. But the way you turn those into the actual details will change all the time.

Adam:
So, how would I use my knowledge of software design to build it in a way in which it was not likely to break in the future? I know that's a very open ended question, but-

Jimmy:
I teach my students something called the RAD process, which is actually a bit of a forced acronym, because the actual elements are ratchets, openness and, and assumptions. So, the best I came up with was: reduce assumptions, add openness, diminish complexity ratchets.

Adam:
Reduce assumptions ...

Jimmy:
So, the thing we'll talk about right now is assumptions. Something I teach a lot is about ... So, software has to be modular in order for it to be feasible to write. And that means that you can look at it as a single piece of code out of context, and talk about what it means, and what must be true about the rest of the code in the system for this to work.

Jimmy:
So for instance, every ... Say you have a point class, and you say point.X, that is already placing an assumption on the system, namely that this is a type that has a field called X, which is an integer. It's a very weak assumption. But those are the kind of assumptions you have to learn to identify.

Jimmy:
So, say you're writing your Docker container scanner, and you can look it as a random 10 line snippets, and try to state in plain English, what this is doing? Because that forces you to say something which is closer to the intention than to the details. And you have to think about how can you look at this 10 lines of code, and through whatever configurations, convince yourself, this does what you want it to do. And then those are all the assumptions that it's placing on the rest of the system.

Jimmy:
And obviously some of the assumptions have to be there. Like if I didn't want to do anything with the points, you need to know it has an X and a Y, but I don't need to know that it's stored as a tuple with X first and Y second. That's an incidental assumption. And so those are things that you learn to identify and eliminate.

Adam:
So, you look for assumptions, and then some of them will be not important to the intention of the software?

Jimmy:
Yeah.

Adam:
And those are the design mistakes, would you say?

Jimmy:
Not necessarily mistakes. This is the process you can take, you can take arbitrarily far. But those are the things that you want to keep localized. So, I'm going to run with the point example again because it's easy to talk about for everyone, even though it's kind of stupid, which is: say you have an application which has passed around X, Y points everywhere.

Jimmy:
One day, you realize that you want part of your system to support both 2D and 3D. So, sometimes these points are X, Y, and sometimes are X, Y, Z. So, the parts that actually needs to be 2D stuff, it's fine if they're taking an X, Y pair. But then you'll have ten thousand other parts of your system, like things you could have saving stuff for passing it around. And if you're actually taking them in this X, Y pair, like when you're taking a parameter you write (X, Y), that is spreading this assumption unnecessarily. And so now, when you go to X, Y, Z, that's when you have to change ten thousand places, because this assumption leaked to all this places where it didn't have to.

Jimmy:
So more realistic as an example, of this kind of phenomenon of an assumption leaking... One of my students works at a large public company, which I won't name, other than to say that I've used their software, and a lot of people listening to this have as well. They have a C++ application, and they're just trying to change the representation of strings. Long ago they could have wrapped ... They could put the string in some, behind some abstract interface. And it's C++, so there'd be zero cost of doing so at runtime, but they didn't. So now his team is putting in a hundred man hours into refactoring their list of strings, and there are several other hundred teams like his.

Adam:
Yeah, I like this point example, because the point could be in 2D or 3D space, and you're saying we can write the software so that we ... Only in the places where we actually need to reference these coordinates of the points, know about the internals of it. And I watched this talk that you did at Strange Loop about ... It was called "You are a Program Synthesizer" and it did talk a little bit about this abstraction barrier. So I was wondering what is program synthesis, and how does it relate?

Jimmy:
Yeah, so program synthesis is broadly defined as program that write programs, so, fun anecdotes: Alan Perlis, computing pioneer, he had the quote that anytime a programmer says, "I should just be able to say what I want, and the computer should give me the program," you should give that guy a lollipop. So, in the intro to program synthesis class on the first day, we give the students lollipops.

Jimmy:
Most of program synthesis — I mean, a writing program in general is hard, so most of it is about specific kinds of programs. So like, every PL conference these days, you'll see a synthesis track, and the paper's like "Synthesizing a Web Crawler from a Demonstration," "Synthesizing SQL Queries from Natural Language," "Synthesizing High Performance Container Libraries from a Normal Container Library [plus a lot of annotations]."

Adam:
It seems like writing programs is difficult. Writing programs that write programs must be quite challenging.

Jimmy:
Yeah. It's its own specialty. So, the way that I see that it, that synthesis-influenced software design, is: when you're writing a program, you want to make it easy to read. If you're designing an API, you want it to be easy for humans to use it to write new programs. How do you determine that? Can you peer into people's head?

Jimmy:
So yeah, you can go down the path of psychology of programming. It's very hard, and I don't know how far anyone's gone...but there are certain things about writing programs and looking at code, understanding what it means, which are not only about psychology, but are more fundamental to the code itself.

Jimmy:
And if there is only one..... you can prove, there's only one algorithm for determining whether a piece of code can ever return to a number greater than a hundred, you can show there's only one algorithm for that, which is not the case, but suppose that you could, then you know that human reading that would have to, in their head, be running something similar to that algorithm. And when you can say, like this is what a program means, and these are the only way to employ that meaning, to compose them and get what these two things together mean.

Jimmy:
So, we can learn about algorithms for writing code and reasoning about code, and use that as a proxy for how humans write code and reason out code. And that gives a lot of insights into the nature of programming and software design.

Adam:
What type of insights does it give us? That makes sense to me. Like at a high level, you're saying, "Oh, if we're working on writing programs that write programs, we have to come up with some understanding of program generation." So what insights have you found?

Jimmy:
So, some of the ones have to do with the search space. So, you do things this way, and you've restricted the amount of information available. Now, there are only, like, three operations that you can do in order to put these together and get results. Whereas, if you don't have this abstraction barrier, and you at any point have the freedom to go down to the raw bits, the raw characters of the string, and do all this stuff, then you have a lot more possibilities. So that's something that helps both with verification and with generation.

Adam:
In my mind it seems related to just extensive use of types. Like when people say, "Make invalid states unrepresentable," is that a similar concept?

Jimmy:
So, every type system designer, at least the kinds of types designed from the software engineering side as opposed to the pure type theory side, so, if you read a paper called, you know, like "A Type System for Concurrent blah, blah, blah," "Type System for Guaranteeing that Database Accesses have Correct Permissions," we let any of those sort of papers ... People who write them have come up with them by studying the domains, studying the kinds of programs people want to write, and usually determining something fundamental about the nature of the problem.

Adam:
Let's say that I want to get better at building software. How do I even know if I'm getting better?

Jimmy:
Say you're playing an instrument, how do you know if you're getting better at the instrument? You have to really cultivate an ear, so you have to be able to listen to yourself in slow motion and notice the little things, listen to masterful recordings. First you'll have a teacher correct you, and eventually you want to correct yourself.

Jimmy:
So, the same thing goes for any skill you're trying to cultivate to high level, and in software design I teach people all about ... Earlier, we talked about the process of turning your intentions, your high level intentions into code. And then there's also the other direction of looking at code. Discovering what facts it implies about a certain point in that program, and then what those facts apply about higher level pieces of data, and eventually all the way up to abstract statements about what the program is doing and what do you want to happen in the world.

Jimmy:
So all those extra layers between these high level intentions and the raw code, it's what I call the logical layer. I used to use the term the third level, but I decided it wasn't descriptive enough. So, I'd say that's learning to read this larger level. It's kind of like being able to detect pitched music, in that everything of software design requires seeing it.

Jimmy:
So every programmer beyond intro can look at some code, and then tell you in plain English what it's doing, at least if they wrote the code. Well, there's all these extra details about being able to say what has to be true at this point, what assumption is it making about the rest of code? Those are the extra ... What else can we change about the system that would make this not work? Those are all the extra details of what you see in the complexity of your code.

Adam:
So what is the logical level of your code or of your design? How would you phrase it? There's the logical level part of your code?

Jimmy:
No, and that's very much not, because for a given piece of code, you can give it many meanings, and they might be very different meanings, like do we say if X is greater than 65, is that a retirement age check? Or is it checking for if something's an ASCII letter? So, very different meanings ascribed to the same code. Or they might be slightly different meanings.

Jimmy:
So, say I have a robot, you know in one arm it makes coffee, and the other arm it opens my mailbox. Well, it does one of those half a second before the other. Is that part of the behavior, or is it just coincidence? So, whether you say it's part of behavior or not, those are two similar but different meanings given to the same concrete behavior.

Jimmy:
So, because there's this many-to-many relation between implementation and interface, you know a single interface can have many implementations — that people get, though people are less keenly aware that a single implementation may correspond to many interfaces. Because of that, it is, in general, impossible to determine the interface from by looking at the code.

Jimmy:
And when I say the interface, I don't mean like, in Java, you say, "Oh, I have a list, and it has an interface.... is the five methods." I mean actual facts, actual logical facts of the behavior. Like if I put something in and then get from that index, I get the thing I just put in. That's a natural fact about all possible sequences of calls.

Adam:
So, the logical level, it's above the code. It's interesting. It makes me think of sometimes if you work on an old software system, and you need to make changes to it, sometimes you have to do something like software anthropology where you have to interview the people who worked on it.

Jimmy:
Software anthropology is software archaeology. You can Google software archaeology, you'll get a ton of hits. And it's because this process about how the high level intentions are turned into high level design, and then to low level design, and then to the code....those processes are either lost or at best living in their head, or in a place you can't find in documentation. And that's what you're extracting.

Jimmy:
So, that's part of the logical level. There's couple of literary places this concept has appeared. So say you put the bits in something, what is flipping a bit doing? The most obvious thing, most concrete appearance, say, I changed a 1 to a 0 or 0 to 1, but then that has a meaning. It's like, "Oh, I changed the permissions of this file." Or I set a fact that I've already visited this square, I've already done this thing. So that's an actual correspondence, which it puts on top of that raw physical action.

Jimmy:
So this is discussed in Gödel, Escher, Bach by Douglas Hofstadter, as...we halve these numbers, how has changing this number done anything? It's because there's an isomorphism between these numbers and something more meaningful. And it's also talked about in "The Art of Motorcycle Maintenance," where you have an action like: I'm turning this screw, and that's what he calls the romantic view. In the classical view, it's that the screw is part of this engine, which is part of this drive train, which blah, blah, blah. And you have these extra layers built in your head of how these parts of the motorcycle link together, and when you're turning a screw you're really acting on those mental layers. And so the same goes for when you write your everyday low-level operations in code.

Adam:
So, is the logical layer synonymous with your earlier comments about intention? Like is the logical layer the intention of the piece of code?

Jimmy:
I'd say the intention is part of the logical layer. Really, when I talk about logical layer, it's kind of a catch-all term for a bunch of related concepts about all these things that code is doing, all these ways describe it's meaning, which are not directly apparent in the code itself. So, when I talk about the logical layer versus the code layer, it's really more of a rhetorical device. It can help stop people from talking past each other.

Jimmy:
So one person says, "I made ... I'm looking at your code, this variable is global, make it not global, that's bad." The person says, "Well, it will do the same thing, whether it's global or not, the way it's written. What can go wrong by leaving it global?" The Person A, maybe they'll just say, "Oh well it's bad." Or they'll concoct some scenario in which something actually bad happens because that person said, "We're not going to do that." They could adopt another scenario and says, "I can do that," and so on.

Jimmy:
And it's because one person is talking about the code as it is now in concrete implementations, and the other is talking about all possible modifications to code, and that's ... They're quantifying over all changes to code. And that's a higher level logical statement. Not very high level, but still ... What all of the things in the logical layer have in common is that they involve quantifying over all possible changes to code. So, a logical statement ... I run x = 1, and then X equals 1 at this point, or X is equivalent to the ID of the stdin or stdout, whichever one, or X is a positive number.

Jimmy:
Those are all logical statements, but they all correspond to infinitely many pieces of code, and those encapsulate the facts need to be true at that point for the code after that to run. So even very low level things, like I set X equal to 1 and then X equals 1. That is talking about all possible changes to code, all possible changes that preserve this fact.

Adam:
Interesting. So, do you think that as a profession, software developers ... Do we put too much focus on reviewing the code and not enough on looking at what these intentions are at the logical level?

Jimmy:
Something I write about in my booklet, the "Seven Mistakes that Cause Fragile Code" is that more junior developers will tend to review things like white space and variable names. Or even a little bit better, maybe they'll say, "You should use this function instead of that function." Whereas more senior developers are more likely to tolerate small variations in that. They're more likely to look much harder at what assumptions are being made by a piece of code.

Jimmy:
It's like this thing only works if you call this thing in the exact point of the program. That is not a stable guarantee. "Stable guarantee": so, that's a word, when I hear it, I know I'm talking to someone senior or becoming senior. Or they might say, "This function takes five parameters, and chances are, we're going to add this thing, and then we'll take seven parameters, and we'll take eight parameters." You need to generalize it somehow. Those are the issues more focused on by a senior engineer doing good review.

Jimmy:
And you know that all of those are not about the current code, but possible changes. It's not a stable guarantee that A happened before B, you make it so that B still works, even if A is called after B. That is about a possible change in the program.

Adam:
So, we should focus on the interfaces, is that a way you might summarize that?

Jimmy:
Yeah, I would say good code review focuses very hard on the interfaces.

Adam:
Yeah. So, I read your pamphlet "Seven Mistakes that Make Fragile Code." I think this one you're discussing, you called "Don't Overthink the Interior." So, I guess the idea is that maybe the method body is less important than actual inputs and outputs?

Jimmy:
Yes.

Adam:
And you also had this concept called "Refactoring is Not Boxing." What does that mean?

Jimmy:
So, one of my friends puts it nicely. Refactoring means changing the factor graph of your code. So, what is not refactoring? If you have a mess on your floor, then there is throwing stuff into a box where it stands, or there's actually figuring out what should go where. And some refactoring that I've seen is very much in this line of boxing.

Jimmy:
So, I was actually discussing this the other week on ... Actually no, just like yesterday on Twitter. The common thing that people do, which is not refactoring, but they call refactoring, goes by the name anti-unification. Not many people know the word anti-unification, but a lot of people do it without knowing that term.

Jimmy:
Anti-unification means, say, you have two piece of code that's different in only one place or two places. So, you wrap those into a function, and you make those two place difference into a parameter. That might be a very good operation, but what does that function mean? Can you give it a clear description independent, other than "I just found a common pattern between these two things?"

Jimmy:
If you can't, then there's really is no way to understand what the thing does, other than, "When I fill in these two parameters, it does this. When I fill in those two parameters, it does that." So that is boxing. You made your mess smaller, but the complexity is still there.

Adam:
So, some people would say, "Don't repeat yourself." It's like the golden rule of software creation. So, if I have two things that are repeated, except they vary on this one parameter, that actually I'm improving the code by boxing those up. Do you?

Jimmy:
Yeah, and that's a very shallow metric, because it's very easy to write a program that does exactly that. And I've been talking about how much software design is about these high level facts about, all changes to code, and it just depends on the global intentions, on the high level intentions, and how all of that stuff is not directly derivable from the code because of this many-to-many relationship between design and implementation.

Jimmy:
And so, if you think you can improve the design of code by doing something that trivial to automate, then that's a red flag. So, "Don't repeat yourself," is really like a simplified version of the real lesson, which is to not repeat concepts. And that will cause you to not repeat code.

Jimmy:
But if you have two different concepts or one concept that appears in two ways, but you very cleverly smush these two ways together, then you still need to understand them independently. And you haven't really improved anything, and you've quite possibly made things worse, because now if you have more indirection your code, you'll need to actually jump around between method definitions to figure out what's going on. If you can't actually give a clean description of what this does, it doesn't require you look at these methods.

Adam:
So, would I be able to tell that just by language usage? Is there a word for these two things put together? Is that a valid case? Or is that ... That sounds shallow to me.

Jimmy:
If I have created a function for these two things, can you tell if it's a good thing if there's a simple way to describe it?

Adam:
Yeah.

Jimmy:
It's not a foolproof indicator, but I think it's a very strong one, in that English has already evolved to try to hide details from things, and we already have a strong intuition about it. So why not reuse it? And I think eventually you should learn how to talk logically about, "Can I give a compact, but precise, description of what this thing does," but using some high level concepts, each of which have their own formal definitions, but I don't have to care about them right now."

Jimmy:
That's the real thing that goes on when you try to formally reason about software, and I think we should turn into our intuition as well. But when you're bootstrapping that process, you can approximate it, just by using the machinery you already have in your brain for using English to abstract details.

Adam:
Interesting. So, actually whether there's a good name for it can be a great guide?

Jimmy:
Yeah. In my review of "The Philosophy of Software Design" by John Ousterhout, I think I highlighted something quite close to that as the best line of the book. It as a line like, "If you're having trouble naming a concept, then maybe the concept's not well defined."

Adam:
Yeah, that's interesting. I mean, the very short version of some domain-driven design course I took a long, long time ago, it was like find all your nouns, and those are the important things. I guess it seemed trivial, but I guess there's some deep truth there.

Jimmy:
Yeah. So, when I took my first computer science class in high school, the teacher had us do something like that, and I thought it was silly, and I ignored it for many years. And now I'm coming back to something close to that, except you're trying to think a little bit harder about "What is your concept," rather than just "What are the nouns?" So, some of your concepts might be verbs, and some of them might be groups of nouns, and some of the nouns might be mere details, but it has a lot in common with that very simple trivial process.

Adam:
Interesting. Yeah. I mean, I guess ultimately there is some intention, as you were saying, some specification that we're working towards. And that is going to be expressed in English, I guess, or some natural language. So, there's some correspondence there.

Jimmy:
Well, I like to think of them as being expressed in formal logic, but you should be able to translate between English and logic.

Adam:
Yeah. Actually, I want to get back to that topic at some point, to ask you about formal proofs. But you know, we were talking about refactoring, boxing things, unboxing things, and I think it's a great vein. You also have this point about getting stuck in an old design, and it seems somewhat related. Could you expand on that point or explain it?

Jimmy:
A very common failure mode of junior engineers is: you're given this one piece of the system and told to implement it, but then the pieces that you're using don't really support what they're doing very well. So they have to hack around it. And in a coaching call just yesterday, someone was talking about how he's writing a trading system, and using some library that wanted information about a trade in a string. And he's like, "Oh, I guess I need to do everything in strings now." And you know, I rolled with it, but for a more experienced person, what I'd tell them want to tell them is, "You should go to whoever made that library and tell them it's stupid and get them to change it."

Adam:
That's funny.

Jimmy:
Yeah. So being stuck in the rut of "Here's the APIs I'm using, here's libraries I'm given, I'm just going to do things the way they do, to do my problem, even it's very awkward," as opposed to, "I'm going to understand how the parts around me work, and then wherever it's not supporting what I want to do, see if I can find a better way that both supports what it already does what I want to do better, so that my code is cleaner, so the existing code is cleaner, or not more ugly, and then go out into other parts of the system and change that. Talk to people working with those, get the dependencies changed."

Jimmy:
It's usually not so easy, but part of going from being a junior to a senior is getting more comfortable doing that. Your responsibility goes from your piece to also understanding the pieces around it and getting pushback on designs, adjacent parts of the system.

Adam:
Another statement that I have written down here from you is, "Code quality is not about code." And so what does that mean? It's definitely catchy.

Jimmy:
Yeah. So, it's what I've been saying all along about the logical layer, in that a common thing that kind of goes along with things like the "Don't repeat yourself" principle is to attach yourself to very superficial measures of code complexity. And that can lead you to all kinds of things that don't change the logic of the program, but seem to make the code shorter. And usually that happens by taking a lot of things that used to be explicit and making them implicit.

Jimmy:
So, one of the worst cases I've seen of this was, actually, I think it was on the official MongoDB blog, where they showed one example using some kind of schema database. Like, look, you have this code that defines a schema, then this code that's maybe generated....there's still this code that accesses it and writes to it. Versus, like you have this MongoDB thing and just put stuff into our giant hash table, therefore it's shorter and has less technical debt and blah, blah, blah. And no, it's, and it's simpler ... And no, it's not simpler. It doesn't have less technical debt. You've just taken this explicit schema and made it implicit in the way that you're using this giant MongoDB hash table thing.

Adam:
So I totally agree, by the way. Yeah. Like we've thrown out our schema. Now it's just implied knowledge that has to travel around, and it's not enforced-

Jimmy:
But by a very superficial read, the code has become simpler. But the logic has not changed.

Adam:
So, we have to understand the code that's not there. That sounds very difficult.

Jimmy:
I actually like the way you're putting it. And in fact, there is a concept in verification, which I think I want to talk more about a future blog post called "ghost code," which is ... And it's kind of like what I was saying earlier when you're turning a screw on the motorcycle, but really you're, in fact, modifying this abstract concept of a drive train. Ghost code is basically: you have some extra variables that exist in your head or in your proof system, but not the actual executable code, which is like a variable for the drive train. And then when you turn the screw, you'll find the code that says "Turn the screw" and you put extra code next to it saying "And correspondingly modify the drive train in this way." So, you say the code that's not there, but that's actually a very serious concept studied and programming language theory.

Adam:
Yeah. So in the Mongo case, there's all this schema stuff that just implied. I can think of other cases, like type system. So, I was working on some code, and it determined if an element is greater or less than an under value, and it does this with an integer. So it returns, like if they're equal, it returns 0. If the first one's greater than the other one, it returns 1. And then if it's the reverse, it's negative 1, right?

Adam:
And so, there's something implied schema there, right? Where although it's an int, the only valid states are these three. Probably a problem, to be honest, that it's represented using an integer.

Jimmy:
Yes. And I have never been able to remember which is which.

Adam:
Yeah, totally, right? There's a ghost schema there as well, I guess, right?

Jimmy:
Yeah. So there's an extra constraint in the output data. From a verification perspective, you'd say it's not only returning a 1, it's also returning a proof that if this thing returns 1, then the left thing is greater than the right thing. And so, in your code, when you write ... When you're checking it for the 1, you have to make this corresponding logical transformation that because it entered the branch where this returning value is 1...to make the extra logical inference that because it returned 1, therefore X is greater than Y.

Adam:
Yeah. And it sounds, to me, like this logical level is actually the proof level that we aren't writing. Is that true?

Jimmy:
Yeah, I'd say so. And in fact, a lot of stuff I'm saying, you go from this returns a 1 to this extra step of because it returns a 1, I apply some fact about this function, therefore X is greater than Y. If you were to do a formal proof in a proof system like Coq or Isabelle or Lean, that would be an explicit step that you'd have to type in.

Jimmy:
Something that I've been interested in a while is trying to see if there is a way where I can teach ordinary programmers to do this kind of thing without having to learn Coq, because you either learn Coq or Isabelle — actually, I haven't used Isabelle, it might be better; I think there's some reason to believe it. If you learn Coq, then you have to do all this very persnickety stuff that has absolutely no relevance to normal programming, and for all these hours and hours and hours, and...If you've ever gotten really sucked into a programming problem, and then find yourself addicted, and then it's 6:00 AM, Coq is three times better at doing that to you. It just makes hours disappear, because you get this constant feedback from the system. But it is a time sink.

Jimmy:
So, you're doing all these things with no relevance to normal programming, but then you also ... You get this intuition of like, "Oh, no, from the fact that it's returned 1, there's an extra step to making the inference that X is greater than Y." That's an insight that I want to be able to give to people without having them have to spend all these hours in Coq. And I haven't figured out how to do that yet other than by having these conversations, and giving a lot of examples.

Adam:
I found this quote on your website. It says, "Since the age of 18, I've been obsessed with the problem of automating software maintenance." So, why is software maintenance important to you?

Jimmy:
Software maintenance is important because the world runs on software, and changing the world means changing the software.

Adam:
So, do you think that maintenance is overly hard at this juncture?

Jimmy:
Yeah. And... Most obvious place this happens is the Y2K problem. So you have who knows how many hundreds of billions being spent to change to, two-digit thing to four-digit thing or some generalized version. And that's a time when all the companies of the world had the same problem at once, but things like that happen all the time. Like earlier I mentioned the string thing with that public company. There's hundreds of hours per team to change their string representation. That kind of thing happens all the time, but it's usually not a case the entire world has the same problem the same time.

Adam:
So, how can we improve software maintenance?

Jimmy:
So, my life goal is to improve software maintenance by a factor of 100. And I think we can get the first 10x just by exactly what we're doing right now, telling developers to think about the logic of the code, to track what assumptions each piece of code is making, and then limit them.

Jimmy:
In the case of the string representation, is they could have very simply done well known super plus things to wrap the string in an abstract interface and this problem...when they wanted to change their representation, it would have taken five minutes instead of ten thousand hours. So averaging that with cases where it's not so dramatic, you maybe can get 10x. The other 10x is going to have to come from tools. So if we do have the ability to try to look at a code and infer what its design is even though there could be more than one answer, then we have to be able to make tools to do the same thing. And then from there be able to make tools that modify the design.

Adam:
So, do these tools exist? Or, I guess, your life's goal is to build them?

Jimmy:
To a limited extent, they do. One very exciting project I've been a little privy to is from Semantic Designs, which is a company in Texas that I spent a few months at a few years ago. They've been doing some work with a large chemical company to try to do design recovery of chemical plants.

Jimmy:
So, they have about a thousand chemical plants around the world. And they're running this very obsolete language. And so, they have all these controllers like, this sensor reads this, then flip the switch, blah, blah, blah. So, they had a high level macro language, which changed over time, which they wrote these in. Then they generated some low level code. Then they changed the low level code, which means the high level code is no longer relevant.

Jimmy:
Now it's 50 years later, and the hardware is failing at an increasing rate. So, we've got to switch these over before that happens. And now you have this giant gob of goo, and you need to figure out what it does.

Jimmy:
So, because you know that this low level code has been generated from these high level libraries, they change over time, and they're modified, you can do smart things to try to recover ... To try to do this clever decompilation, discover what high level operation things correspond to, and then translate it into our little form, then from there translate to Python. And they've been doing that. Then they're already ported ... I don't know what the number is. By now, it's probably over a hundred chemical plants.

Adam:
So, you work, you have a tool that ... I remember the specific line that said, "Programs that write programs that write programs." It took me a little bit to understand this, actually. You've built a tool that allows for cross language refactorings? Is that-

Jimmy:
Okay, so, you're mixing up several different things there. So, my research interests ... So, I was talking about tools like the thing used for the chemical plants. And my goal was to make software maintenance better. Part of that is making tools like that better and more common, tools like the chemical plant thingy.

Jimmy:
But there already are a lot of people in software engineering research who do know how to build those tools. But a reason why those aren't very common is that they're all very specific and hard to build. So for instance, if you had a program that did all its calculations in Fahrenheit, and you wanted to change the code to use Celsius internally, rather than Fahrenheit, that might be a very tough change. And even if you could pay a million dollars to an expert, or a hundred thousand dollars to an expert, and get them to write a tool that does it, you probably wouldn't.

Jimmy:
Even if there's a hundred people, or a thousand people also in the world doing the same thing, you couldn't find them. And their tool would probably only be for one language, and you'd probably have to rewrite it even from Scala to Java, even for similar language. So I think that by reducing the cost and making these tools, we can see a much larger number of these advanced whole program refactoring things built. So that is my goal in research.

Jimmy:
And so, I'm interested in finding better ways to build these kinds of tools, to make them more general and to automate parts of building them. And since I'm interested in automating parts of building these tools, that has the cheeky thing "programs that write programs that write programs."

Adam:
And this is...Cubix is your framework?

Jimmy:
Yeah. So, Cubix is, it's a stretch, to put Cubix literally under the tagline "programs that write programs or write programs." But the paper I just submitted to POPL is on something called Mandate, which is much more fitting with that moniker.

Adam:
Cool. So what is the end goal here in terms of software maintenance?

Jimmy:
The big dream is that if you wanted to write your thing, which takes a program that does stuff with Fahrenheit, changes its internal code to do stuff in Celsius, then I should be able to describe it at a pretty high level, and then get that tool, and then I should be able to only change a few lines, and they get the tool for Java, and then get the tool for C and they get the tool for Rust.

Adam:
That makes sense. So, the reason it's "programs that write programs the write programs" is the Top level program that you're working on could be fed a program about how to change program in a loose sense. Change Fahrenheit to Celsius, and then it will generate programs for a myriad of languages that can do that source transformation to those languages.

Adam:
So, I understand you're doing a training course on some of the principles we've gone over today.

Jimmy:
Yes. My have passion and livelihood is on teaching a lot of the things I talked with Adam earlier about how to see the logic of your programs and write things in ways that are easy to read and also need less time debugging, and they'll be easier to add features to, except in a much more concrete way with a lot more examples, a lot of practice with very good feedback on the practice. That's my advanced software design web course. The next run is starting on August 23rd, I'll be taking 15 people.

Adam:
One other question I wanted to ask you. I've been looking into you, preparing for this interview. So, you currently are at MIT, working on your Ph. D., to my understanding?

Jimmy:
Yes.

Adam:
But according to the Internet, Peter Thiel paid you to not go to university.

Jimmy:
Yep.

Adam:
So, could you describe the program you were a part of and how it was?

Jimmy:
Yes. So, I was part of the Thiel...."20 Under 20" Thiel Fellowship. I entered in 2012, and got a hundred thousand dollars over two years. So, at the time I had been a junior at Carnegie Mellon, but I did not have a senior year. I just left. Fortunately, I still graduated with two degrees, because I had worked really hard in those three years. I was planning to do a master's, but I dropped out.

Jimmy:
So, for two years, I worked ... For one year and a half, I worked on a company, trying to write a program that fixed bugs in other programs, and I talked more about that elsewhere, as such, in my interview with Future of Coding, but that wound up not going so well. Both because the technology was still in its infancy, because I was not an expert in the technology at the time, I was learning as I went, and also because I learned the hard way that a good product idea is not a good business idea. Whoever tells you that good business ideas are a dime a dozen, it's totally wrong.

Adam:
Oh, interesting. How so?

Jimmy:
Good product ideas are dime a dozen. Good product ideas, that...where you can find the audience, where you can convince the buyer the value, that are protected against competition, how we can market it, where you can hire people to work on it, where you can sell add-ons and grow. When you add in all those other factors needed to turn a product into business idea, they become very few and far between.

Adam:
That's interesting because, you've kind of carved out a distinction for me that I haven't thought of, which is that people's business ideas are often just, "Hey, I whatever. I built a better mouse trap or something," but that's not really a business idea. That's just a product. So, did you find the program ... Your two years useful? Was it a good experience?

Jimmy:
Oh yeah. Very. So, it certainly laid the foundations for what I'm doing now, and that is was from a process building this company that I gained the insight into realizing I need to go one middle level up into making these kinds of tools easier to build. But also, managed to become very connected to the industrial side of things, which is why I'm able to do what I do now. And my teachings about software design are very much bridging practice and theory.

Adam:
Yeah. You don't see many people who are working in academia also doing kind of a nuts and bolts advanced software design course.

Jimmy:
You don't see many people anywhere doing in advanced software design courses. In fact, there are zero books written today that I've found — and I've looked — that I would call an advanced software design book. There's a large number of beginner books in software design, and a small handful of intermediate-level ones, but none that I would call advanced. That's because no one knows how to teach it, and my work has changed that.

Adam:
That's interesting. You had this post called ... On your blog called ... Oh, it was about Benjamin Franklin method for programming books.

Jimmy:
Yes.

Adam:
Could you explain that?

Jimmy:
Yeah. It's the whole genre of programming books, as distinguished from computer science books. Usually, they are like "Postgres 7" or "Writing Games on iOS" or "Intro to Rails," and a core feature of them is they have a lot of code snippets. They're walking you through these examples of what to do with these code snippets. Do you type them in? Do you download them from the book's website? How do you learn from these code snippets?

Jimmy:
So, from multiple sources, I learned about the Benjamin Franklin method, which is... he wrote in his autobiography, this method he had for being better at writing without a teacher. He found these articles that he liked that were totally better than his. And he didn't understand how he could produce articles like them. So, he would take notes in these articles, compressed them into like a few sentences. A few minutes later, he would come back, look at his descriptions and try to reproduce an article like it from the descriptions.

Jimmy:
And then that would give him a laser eye to focus on all the little tricks, the rhetorical flares that the professional writers did that made their article better than his. And so he learned. And that's what's a lot of artists in other fields do. So, visual artists might try to copy a painting, and in doing so will discover all the techniques used by the original painter.

Jimmy:
And so, I realized that a modification in that idea can be applied very easily to reading programming books. The most faithful version would be to try to write a description of what this program does that you have the snippet for, come back a few days later, and then try to reproduce it. And you can do that. The feedback cycle is kind of slow, and you might not actually want to care about reproducing the exact same program. It's not as important to wait a few days, forget everything you knew before.

Jimmy:
You can get a very cheap version of this, which is very effective, just by looking at the page, flipping over, looking away, and then trying to write the program. And even though that might only add a few seconds versus copying from the page, in those few seconds, you'll forget so much about the syntax, unless you can compress it into a better form. And that gives you more learning.

Adam:
Yeah, I like this method. I haven't tried it, but I think that we've read the same book, because I saw a reference to Anders Ericcsson, I think, the deliberate practice guy.

Jimmy:
Yeah. So, my summer in Texas, when I was at Semantic Designs. I had a car for the first time since I left California, so I was listening to a lot of audio books. And so within a short period of time, I read both Anders Ericsson's "Peak," and also Josh Waitzkin's "The Art of Learning," both incredible books. And also "Mindsight" by Daniel Siegel. And I think all three of those, or at least two of the three, all mentioned this technique.

Adam:
Yeah, I love it. Well, so I haven't tried it, but when I saw you reference it, I was like, "That totally makes sense." So Anders Ericsson, right, is the deliberate practice guy?

Jimmy:
Yes.

Adam:
We touched on this a little before. He spent a lot of time studying how people become experts. So, he has a number of techniques in his book for working in a field that's less defined, right? Because he studied chess, right? And in chess, there's very defined progression, and I think violin as well.

Adam:
But, as you were saying, okay, here we are, software development. There's no advanced books, right? Everything's surface level, how to learn X. So, another technique that he had that I liked was ... He called it the TOPGUN method from the TOPGUN, the school. Do you remember this technique?

Jimmy:
I remember him talking about TOPGUN Academy. I don't remember the generalized version.

Adam:
So, the idea was ... I think it was the Navy pilots were getting basically schooled by the Russians at some point. And if I'm murder the story, you know, write me a letter, listeners. But this is my approximation. So, they knew they were failing at whatever, dog fighting, right? But they didn't know how to get better. So, they started this TOPGUN school where they just fought against each other. So, they didn't have a teacher, but they managed to, as a group, teach themselves how to be better by competing against each other.

Adam:
I've often thought about this in some ways to do with how a team could get better at software design by just critiquing each other. That was basically their approach, except much more adversarial, like they're pretending to shoot each other down. But they managed to learn by feedback.

Jimmy:
I think it kind of boils down to a bigger thing, which is very often the way to get better is to recognize the difference between practice and performance, between just doing something, versus trying to get better. And so, all too often, in these kinds of fields, the way to get ahead is simply to try to get ahead. Like other people are just programming but not really deliberately focusing on what they did and what they could be doing better, then that's all you need.

Jimmy:
I want to share one story that's that's similar to the TOPGUN academy, but more programming related. An old story of programming folklore, but not super well known. And that is a story of the IBM black team. I think this is sixties, seventies, IBM had many teams with their testers. So, they decided just to take the top 10 percent of testers and put them on their own team.

Jimmy:
And they thought, "Okay, maybe this team will be 25 percent better than the other testing teams." And that's not what happened. They actually became 10x better than the other testing teams, in that now you have a whole group of people who are more motivated, and they formed their personalities around being good at breaking other people's software, and they fed off each other. So, they would learn all these things, share them with each other at lunch, and pretty soon they took to dressing all black, and then wearing mustaches that they would twirl as they try to break someone's piece of software.

Adam:
That's awesome. So, you mentioned that you think that practice is different than performance. How so?

Jimmy:
This is something which is talked about, by K. Anders Ericsson a lot. So, you know something that you're good at, like maybe running, or sailing, or playing an instrument, or certain kinds of programming. And you get into this zone, and it feels pleasurable, and it's really comfortable, pleased with how much stuff you can do. But that is performance. That is not practice.

Jimmy:
You know you're growing because you're pushing yourself to your limits, because you're stretching your brain till it's uncomfortable.

Adam:
When we talk about software design, I think a standard idea is you just have to build a bunch of software, right? If you got enough experience under your belt, then you'll be good at software design.

Jimmy:
Yeah.

Adam:
What do you think of that?

Jimmy:
If you're writing software, you can't help but occasionally doing things you've never done before, but there's nothing deliberate about that. You've heard the line forever, "Practice makes perfect." Maybe you've heard the more modern line, "Perfect practice makes perfect." Practice does not make perfect. Perfect practice makes perfect.

Jimmy:
We can find issue with that line as well, but it's closer to the truth. And so, similar in software. Just doing it, does not necessarily make you better if you're doing the same things over and over. You need to do it better. Do things you haven't done before.

Adam:
And what does practice look like? If I want to know how to be better at designing software systems, how do I practice?

Jimmy:
So it's really a few counterintuitive things. One is I teach people about ways that they can lock down their design, not only making valid states on representable, but also make it so valid states can only be represented in one way. No junk, no fluff.

Jimmy:
So, you learn all these techniques, it becomes much harder to write the wrong program, easier to write the right program. And then they go and take it too far, and try to fill up all these kind of really elaborate type stems that force you to do things a very certain way or otherwise everything will break. They write 20 thousand different types. And they're taking it too far? Yeah, totally. And I say go for it. Everyone needs to go through that phase of taking it too far before they learn how to do it naturally.

Adam:
Interesting. So, when I learn something, I should take that thing, and really see how far I can stretch that piece of that concept.

Jimmy:
Not necessarily, but you have to run towards things that are harder than what you've done before, not run away from them. That doesn't always mean you check that code into production, but you have to experience it.

Adam:
Yeah. I think that I could learn a lot about how databases work by building my own toy database system, but I should never actually use it. It's more of a learning experience.

Jimmy:
And that's the case, like....even if there are specific tasks where you can write ... Where you think you can write something that's better than Postgres for one very specific purpose, and maybe your program can be better if you do that; it's very unlikely you will encounter a case where doing that is actually a trade-off. And so you might never learn....you might never get a chance to exercise that skill in the wild.

Jimmy:
That's what I'm saying. The practice, not the performance. So, you can't always lean on saying, "If I do this enough, then I'll get better at every point I want to get better at." You have to aside time, saying, I'm going to do this thing. This is practice. This is not going to produce an artifact I can use. It's just to help me get better. And I'm okay with that."

Adam:
So, would you advocate building a web scraper over and over and over again? Like different ways, different languages, abusing type systems, having no type systems, doing it lazily, doing it like ... I dunno. Is that a valid way to practice?

Jimmy:
You will certainly learn more doing that than if you just wrote different web scrapers the same way over and over.

Adam:
I'm just, I'm grasping at how we can practice. I feel like, okay, if you're a baseball player, it's pretty clear the difference between practice and a game. But I don't know how clear it is as a software developer.

Jimmy:
Yeah, I don't either. But the beauty is that because there's such a weak concept of practice, that means you do anything...you find any of these things that are questionable practice, even if it's questionable how good they are...you're still doing better than everyone else.

Adam:
Because nobody else practices. They're all just playing the games. So any practice-

Jimmy:
Yes. And this is why I say that while there are a lot of good programmers alive today, a lot of master programmers, but I don't think anyone alive today deserves the title "Grandmaster Programmer."

Adam:
Interesting. Yeah. I wish there was, right? Like if you wanted to become a chess prodigy, I think there's a certain ladder that you need to follow, right? At some point, it involves working with a grand master and spending all of your waking hours doing things. But they know what those things are. I don't feel we do know what those things are. That's not a question. I'm just thinking out loud here. You got my gears turning.

Adam:
Well, I think that I went through basically every question I had for you at this juncture. So, this has been a lot of fun. So, thank you so much for joining me, Jimmy. It's been great.

Jimmy:
You too.

Adam:
That was the interview. I think Jimmy's concept of the logical level in software is one I'm going to keep thinking about for quite awhile. Programs that write programs that write programs. Jimmy is coming at things from a whole different level.

Adam:
If you're listening to this now, say hello. I'd love to hear what you thought of the episode. I know I could have asked more questions about program synthesis, but I didn't want to miss Jimmy's insights on software design.

Adam:
Thanks to everyone in the Slack channel as always, and thanks to those leaving comments on the website. Eli and Chris Buena in particular. The website comments go into a moderation queue if they have a link, but should otherwise appear right away, and if I miss some link that should have been in the show notes for an episode, dropping it via a blog comment is a great idea. Timmy did this for the Cory Doctorow episode, and that was great. Tim has been requesting some closure episodes, so I think I should probably get on that.

Adam:
Until next time. Thank you so much for listening.

Monday, July 15, 2019

The Best Refactoring You've Never Heard Of


(video and transcript of my Compose 2019 talk, given June 25th, 2019.)




Transcript



Hello everyone. I'm so excited to be here at Compose, along so many enthusiastic and some very advanced functional programmers. I live a dual life. At day, I teach computers how to think about code more deeply and, at night, I teach people how to think about code more deeply. So, this is the talk I've been really excited about the for last year; this is hands down the coolest thing I learned in the year of 2018. I was just reading this paper about programming language semantics and it was like, "Oh, these two things look completely different! Here's how they're the same, you do this." I was like: wait, what was that? What?

It explains so many changes that I see people like myself already do. This got all of one slide in my web course I teach, but now I'll get a chance to really explain why it's so cool

You are all here to learn.

Learning is about mental representations.

Mental representations are about abstractions.

And abstractions are about equivalences.

Let's look at another equivalence: the equivalence between recursion and iteration.


Many moons ago, you may have been asked to write a recursive function that prints everything in a list, and you write code like so. And then someone else comes up and says, "Oh, I know. I have another way of doing it. Another approach: an iterative version." And now, you know that these are equivalent and it's not really two ideas but two faces of the same approach. You can turn one into the other.


The trade-offs: one is shorter, the other is more memory efficient. You can automatically convert one to the other, get the benefits of both. And because you see this, these two ideas have fused in your head into one and programming becomes that much easier.



So, equivalences help you see things at a deeper level. But this is not just any random equivalence that I'm bringing up because, imagine, for a second, you're asked to not print everything in a list, but to print everything in a tree. This is no longer a tail recursive and you can puzzle out how you wouldn't turn this into a loop. You know it has to be possible, using a stack, but: not so cookie-cutter.

Well, what I'd like you to do is now forget anything you knew before about how you would make this iterative, because you don't need that. It's actually a special case of defunctionalization which is what I'm going to be teaching you today.



We're going to actually be looking at four different equivalences; variants A and B. Every case, they are the same. There's a way of getting it from one to the other. Perhaps you've already done one of these changes yourself.

But then further, we'll see that actually all four of these techniques are really all instances of one general transformation, defunctionalization, and its inverse, refunctionalization.




Defunctionalization is about turning functions into not-functions, into data structures. And it starts with the simplest example of: a filter. So, here's a filter function. It's nothing special. It takes in a list of integers and it filters according to some predicate and gives you a smaller list. This is the code you might have seen a thousand times before. But, in the way that filters actually show up in applications, there's something missing here.



This is Jira. These are its filters. When you're selecting a checkbox from there, you're not picking up a lambda and sending it over to the server. Filtering for instances in AWS, filtering for Chrome extensions — you are never clicking on a lambda and then sending that. You're clicking on something else. Why? Because, barring magic, you cannot send a function.


And same goes for desktop apps. Here's a desktop task manager, desktop Jira. Here are a couple other things. All of these have filters that you can save in your preferences so you only view a subset of things.

In every case, you are selecting checkboxes that are not lambdas. So, you probably have an idea of how we implement this. But how does this correspond to the higher-order filter function we had a slide ago? What we're going to do is go from this filter function that takes a predicate as a parameter to something that takes a filter data type as a parameter.



So, what is this filter data type? We're going to turn the functions into the data type by example. This is defunctionalization by example.


Our first step is: we gather all invocations of the filter function, and there's only a finite number of lambdas you can pass in when we're looking at the whole program. Okay? Now, we create an enum with one case for each function. Three possible functions. Now, we need a way to use them so we go to "filter," we replace all uses of that function with the data type. Now we have to change the filter definition accordingly so, the function type becomes a data structure, and, where before we called the function directly, now we apply a data structure.


[An animation on this slide shows the old filter function and its invocation morphing into the use of Filter and apply.]

How does that apply work? We grab the code from the places where we use filter. Each case of the data type corresponds to a use of this function. And we put that into our apply. So this was turning all uses of this higher-order function, finding all arguments passed for this function parameter, and we turn it into a data type with those uses as the code for applying it.


Now, there's something a little unsatisfying about this story. So, I have a LessThanThree built in into my filter language. What if I want LessThanFour? LessThanFive? Do I need an infinite-case data structure? The answer is yes — if you manage to call filter infinitely-many times in your source code, like an infinite-line program. If you don't, then it collapses down. More likely than having infinitely-many calls of case, infinitely-many calls of filter, each with a different integer, you will have one or two calls with a free variable for that other integer.


So, the full story of defunctionalization is that those free variables get stored into your data structure. So, it's actually just LessThan and we take that "y" and we stuff it in the data structure. And now, the apply will get updated to use that.



And moreover, these free variables can even be functions. So, here is a lambda that takes in two other functions and gives you the "And" of them. So those functions are free variables, and they even have the same type as the filter itself. Which means, when we turn this into a data type, we get a recursive data type. So this data type is a language of filters, recursive filters.

And I didn't design this data type, it just appeared by moving around the code that was already there. And thus, we get a huge language of filters of "and"'s and "or"'s, purely mechanically.


So, now we've seen enough of defunctionalization to start talking about its general trade-offs. The higher-order version, the refunctionalized style or direct style, is open.



What that means is: imagine that I'm using filter a few times and wanted to call with a different function. All I need to do is call filter again with some other lambda. So the higher order version is open: it's easy to add new variants.


Whereas the defunctionalized version is closed. If I wanted to add one of these new operations, I need to go into filter.hs, into my definition of filter, add this to the data type, add it to apply and only then can I use a new case.



But, what I get from that is that it's serializable. It is possible to, very trivially, send this data structure over the wire, to save to disk, and in doing so, I get the checkboxes that you saw in the actual apps.




Let's look at another example. In that one, you could have looked at both the before and after code and seen how they correspond, even without this talk. Let's look at an example where that's a lot harder.

So, we're back to printing the tree. So, here's my tree data type, it's nothing special, has a left and a right. We'll just do an inorder traversal, so traverse like this [animation of tree traversal plays] and I'd like you to take a few seconds and try to think to yourself: how would you go about turning this code into the iterative version? And, if that comes to you too easily, then think about, what if you wanted to add parentheses before and after these calls? Then, how do you turn this into the iterative version? Take a few seconds.


So, the solution, of course, is to use defunctionalization. We'll see the actual code, the iterative version, in a few slides. So, we're going to use defunctionalization but...what functions are there in this to defunctionalize?


This function returns. It finishes at some point. What happens when you call return? It runs some more code, it runs the rest of the program and, in this case, the rest is either to print the right half of a tree or to be done, finish the entire tree. So, when you run return, it does it like a function. From the perspective of the first printTree, returning from it runs the thing to print the node and the right half of the tree, like a function.



[On this slide, an animation shows the process of pulling the latter half of the body of printTree into a callback.]

So, a lot of you in the room already know about continuations, continuation-passing style, exactly what I was just talking about. So, the first step is to do CPS conversion on this code. So, we're going to pass in the thing to do after the code is finished and wherever we call print tree, we say do this thing after. We wrap that into a function, and, after the whole thing is done — we printed both halves of this tree — then we say "Do the thing after the entire function."


Now, we have some functions to defunctionalize. So what we're going to do: we're going to defunctionalize the continuation. Let's do that.



So, there are two possible continuations, two functions to put into our data structure. One is this thing to print a node and the right half of a tree but the other is that, when all is said and done, we're going to finish. So, let's grab these things — oh yes, and when you print the right half of a tree you have to remember what tree you're printing and the thing to do after you've printed the sub-tree.

So, let's rip those out and we have our continuation data type. And it's either done or it prints a tree and then goes on. And, in this case, if you're using a language like Java, then this data type can be encoded elegantly as: we have a tree and a continuation, except that your next continuation might actually be "Done."

Okay, so what is this thing? You have a value and a next pointer. This is a linked list. Not only that but the way we're going to use it is as a stack. We know we want a stack at the end so this is kind of exciting.




[On this slide, an animation shows ripping out the body of the two callbacks to form the apply function on the next slide.]




Okay, so let's carry out the defunctionalization, rip out the code; those two cases become my apply. Now, where before, I actually used the action, instead I'm going to pass in a data structure representing the thing to do. And here's a stack, a linked list — a continuation of the rest of the tree to print.


[On this slide, an animation plays showing replacing the call to apply with its body.]

So, it evolves and it kind of looks like a stack but this doesn't look like it's really taking us anywhere. We still have this big recursive function. How is it getting us closer? Well, we're just going to move some things around. Let's do some inlining. Apply is called once; let's move it over. Now this is interesting: printTree is always the last thing called in this function, whenever it's called. This is now a tail-recursive function.


And that means we do the normal trick for making a tail-recursive function iterative. Wrap it in a while(true). Every place before where we would have done a recursive call, we just update the parameters and go on to the next run of the loop.



And there's actually something interesting about our use of this continuation type. Where we say k = new Kont(tree, k), that is a stack push. k is only used once so we can make this mutable. Where it says k = k.next, that is a stack pop. And so, this is exactly the iterative solution to this problem. And, in fact, if we did have a more complicated recursion, say, if you want us to use something after you print the right half of a tree, like print a close-parenthesis, the iterative solution could get a lot more complicated. But, if you do the exact same process, you get the code for doing that. No thinking required.


So, let's review. We did a four step process to go from the recursive to the iterative version. First, we made it CPS so these functions appeared. And then we defunctionalized those functions. So now we have these two mutually recursive functions passing around this continuation object. We inlined one to the other. Now it's tail recursive; it becomes a loop. But, of course, the inlining is just kind of moving things around, so we can do the tail-recursion elimination. The big insight that made this all possible, the real workhorse of this transformation, was to defunctionalize the continuation!





So, we've seen two pretty different uses of defunctionalization. Let's take a few moments to go into the history of this.


Defunctionalization was invented as a compilers technique by John Reynolds, a man well ahead of his time. I have a language like ML, I want to turn it into C, I need to get rid of all these higher-order functions — let's turn them into data types! There are some compilers that do do things this way. But the problem is, you need to see the whole program to know what this data type should be. So, if you're doing whole program optimization, you can do this, and some of them do. But most people like separate compilation — you know, compile one file at a time — so most compilers don't do this.

But then, around 30 years later, this guy Danvy, he started writing a bunch of papers and he found that this technique is actually also useful as a program manipulation technique. Not for the compiler but for people to turn a program that looks one way into something that looks very different, but does the same thing. So, he started writing a ton of papers using this technique with his colleagues. And he also showed that going the other way, refunctionalization, was also useful.

So, he'd been doing it before but in this paper, "Refunctionalization at Work," he had some cool examples where he took these very old state machine algorithms, said: this state machine, those things look a lot like defunctionalized continuations, let me refunctionalize them. Let me turn them into direct style. And so, he had these parsing algorithms that he then turned into using these nested recursions that were really beautiful, but I probably could have never come up with this fancy nested recursion unless you derive it like this.

So, if you want to see umpteen more examples, usually in programming language semantics, just read any other paper by Danvy since 2001.


For the next few examples, we're going to be talking about turning the rest of your computation into a data structure, where the rest of the computation actually goes into the operating system and outside of your program. So, for seeing how the continuations act in operating systems, I recommend this paper by Oleg and my friend Ken.


So, let's go into something that probably a lot of people in this room actually do do on a day to day basis. An example of defunctionalization that many of you already do without realizing it: web actions.
So, here's my favorite website, Hacker News. Oooh, "AI habitat," that sounds kind of interesting. And I want to leave a comment but I'm not logged in.


When Hacker News first came out, around 2007, it had this cool feature that was unheard of at that time and that feature was this. I'm logged out. When I start typing in a comment to leave, it takes me to login page. But then, when I finished logging in, it remembered what I was doing and it posts my comment. Wow. What is this magic?



The way it worked back then is using continuations. To illustrate, I'm going to show, imagine you're writing a command-line version of Hacker News. The code will look pretty different from what you would do for the web, if you do things the normal web way.

You say, all right, give me a comment, oh wait you're not logged in! Let me print out "Enter username and password." Then you print out the result of composing the comment. This is five lines of code, all direct-style, but really there are already continuations going on here. From the perspective of a request to login, the rest of the action is a continuation. And so, in web, when you request login, it actually sends a response across the network to get a new web page. But the same idea is going on that the rest of this action is a continuation.

But, when you do things the normal way, just go to login page, the rest of the action just gets lost. So, for Hacker News, they decided to not lose the continuation.


So, here is the old version of Hacker News, the open-source version. And my login screen, it's actually tracking this function ID. The server knows that this is the hash of a lambda that is a continuation. The rest of my action that I want to do when it completes the login. And so that's how it remembers the comment I was in the middle of posting.


So, continuations. Before, in all my slides, I've had to nest lambdas and everything whenever we did continuations. Does that mean that Hacker News is written like this? This is callback hell. If you've done server-side JavaScript pre-ES6, then it's all too familiar to you. Well, yes and no, but even still, there's something really unsatisfying about this. So we take that continuation and just turn it into an ID, a lambda, there are things you can't do.


So, what if something goes wrong? Oh, I was in the middle of doing continuation "TQE.." — what does that mean? And also, what if you wanted to load-balance Hacker News so that, when you log in, it actually sends you to a different server, that other server doesn't have a map saying "this hash ID is this lambda." You cannot send a function.

So, in order to load balance this, to make this work across machines, you need to do something else. And that something else is, repeat after me: defunctionalize the continuation!




Pause for a second. Let's think about why is doing this kind of interaction harder on the web than direct-style? Well, is it really? I've already explained that direct style is involving continuations behind the scenes. So let's take a closer look at that.



So, when I do a read, it sends a system call to your operating system. Really, the operating system now captures the rest of my program as continuation. It stores this continuation. Then it does its thing and then it invokes the continuation to jump back to my program. So this style of continuations is coroutines.




Client/server is also coroutines. I have a big interaction and I've just split into parts, so: go display the page. That's like a system call. Then I invoke the continuation, jump back into the server; it's one action split across two processes. Coroutines. So that suggests that we can use similar tricks to how the operation system is making your thing direct-style to also make the web version nice code.



So, Hacker News was written in a continuation-based web server where your code is direct-style like on the left but it still runs across multiple web pages like on the right.


[On this slide, an animation runs, transforming the three parts of the code snippet into the three webpages.]

And, of course, when you do normal web programming, you really have done this part by hand. You've broken your action to pieces, one for each continuation, split into parts. Each of those is its own web page. And now, let's talk about defunctionalizing the, help me, continuation!



When you defunctionalize the continuation, it looks like this. That's kind of, well, that looks kind of familiar. So, I have an ID saying which continuation. I have some parameters for the free variables of that continuation, for the things it has to remember and what it needs to do next. And, that looks a lot like things that web programmers do every single day.


And that is, in fact, how modern Hacker News works, where instead of having a function ID, it now has an action name and values, saying what to do next. So, presumably, the modern maintainers of Hacker News have, by hand, defunctionalized these continuations and split everything into parts and ruined all the beautiful, beautiful code that Paul Graham used to brag about.


So, this is the story of why doing interactions in web is kind of nastier than on a command line, in that we, by hand, have to take these actions, break them into pieces, defunctionalize them, give them names and data structures. So, if you're a normal web programmer and you really are trying hard to do things as cleanly as for the command line then, either you figure out how to automate this process, or you throw up your hands and weep and accept fate.


Going to do one more example which is very similar in flavor to this last one. Let's talk about blocking I/O again. So, as before, you do a read, it does a system call then the computer goes, does its thing, then it invokes the rest of your program and resumes. So, a lot of people don't like that. It's slow, things stop. So there are many styles of non-blocking I/O and I'm going to talk about non-blocking I/O via notifications.


In Linux, this is done via SIGIO. That means you tell the operating system, "I want you to read this file and give it to me in this buffer." And then, you keep going and do your thing and a bit later, the operating system says, "Stop! I finished your read, here it is," and it runs some code in your process saying, "You can now use this data." So, now we're going to talk about this non-blocking via notifications I/O system.


Now, of course, there are many ways to do non-blocking I/O but, for pretty much all of them, you have all this extra stuff to worry about and concurrency and so, let's talk about how, really, these kind of I/O notifications are just like the direct style but where you have defunctionalized the continuation.



Let's take our laser eyes back to the read. So, here we've split your system up into three parts: Your program, the actual file.h code, and the operating system. So, you tell the file to read, it does some stuff; it eventually goes to the operating system. The operating system does its thing. Now, it goes back to your file.h, updates some state, saying, "This file has read this many bytes," blah, blah blah. Then it goes through the rest of your program.



So, from the perspective of accessing the disc, all the stuff that happens afterwards is the continuation. If we take only part of this process, like the part with immediate relevance to what happens when you finish reading, that is a delimited continuation.



And so, when we create a notification action, it's really this delimited continuation, this piece of the rest of your program related to when you finish the read, that gets stored somewhere, and then invoked when the read is completed.

So, these notification actions are really a running a delimited continuation. But, not only that, it's running delimited continuation in defunctionalized form.







We've taken a big tour through various problems in programming and seen how one idea can unify all of them and help you see things at a deeper level. So, helping programmers see things at a deeper level, write better code as a result, is what I do. And I'd like to...

[This paragraph was an advertisement for a workshop I gave the following evening at Byte Academy in NYC, which has now passed.  If you liked this talk, I encourage you to check out my web course on advanced software design.]



So, we looked at all these examples, and, often when you're programming, you have an idea and you have so many ways to do it. I want to do actions, async/await, method passing, so many things, can read a book about each of them, so much to learn. And it's really just too much to learn.



In physics, Feynman tells us that you cannot memorize formulas. You can't just go to the book, memorize formula, learn to apply it. There's too many of them. Instead, we need to learn about the relationships between the formulas. Derive them for yourself when they're needed.


And so, we want to do the same in programming. See many different APIs, many concepts, and see how there are fewer deeper ideas behind them. So, instead of seeing a bunch of approaches to a problem, I want you to see a web of, a single design and various ways you manipulate it to exactly the outcome that you want. So these are many transformations. Each of them could be their own talk or article. But today, I've taught you one very important transformation, not to rule them all but to rule a lot of them. And that is to, everyone with me, defunctionalize the continuation!