Rust for Everyone!

Rust promises to empower everyone to build reliable software, but its unique features create steep learning curves. In this talk, Will Crichton presents four years of research into evidence-based tools that help both novices and experts work more effectively with Rust’s complexity, including: Ownership visualization: Interactive diagrams showing permissions (read/write/own) that help students understand borrowing better than rule-based explanations; Trait debugging: A graphical tool for navigating complex trait inference trees when compiler errors become unreadably long; and Program slicing: Using Rust’s type system to automatically highlight only code relevant to specific variables. All tools discussed are open source and available for use.

Transcript

Yes, thank you for the kind introduction. As mentioned, my name is Will Crichton. I just started as an Assistant Professor at Brown University. And so most of the work in this talk was done either during my PhD at Stanford or during the postdoc I just did at Brown.

And this talk is generally about several years of research that I’ve done on making Rust more accessible to novices and experts alike. For the uninitiated, Rust is a relatively new programming language created in 2010, stable in 2015. And according to Rust’s website, it is a language empowering everyone to build reliable and efficient software.

And a way that I like to think about Rust is that it combines systems programming and functional programming. Now what those terms mean is also a little contestable, but I will define them as systems programming means that we want efficient and predictable performance and resource usage like you might find in C or C++. Functional programming means that we want expressive and correct programs through powerful abstractions like you can find in, say, OCaml or Haskell.

So I personally think that Rust is the single most exciting programming language built in the last 15 years, and that is because systems programming is really important. A lot of our vital infrastructure, operating systems, compilers and networks, is all written in C or C++, or used to be. And for a long time, those were really your only options for implementing those kinds of systems.

And even today, C++ remains the dominant language for large systems, but C++, if you’ve ever had to deal with it, is a monstrously complex language. And that complexity has direct costs, like how hard it is to write memory-safe and thread-safe code at scale, but it also has hidden costs.

You can tell me later if you think this is consistent with your experience, but certainly from my experience in undergrad and just talking to programmers, I think we tend to mythologize the systems programmer a little bit as this legendary figure whose massive brain can handle all the insane complexity generated by systems programming languages. And that turns people away from systems programming who don’t think that they fit that mold. And I hate that.

So I appreciate Rust’s emphasis on building a language to empower everyone, and hence the title of the talk, “Rust for Everyone”. But Rust is still a complex language. I said Rust is systems plus functional programming as if that’s a good thing. But to someone trying to learn Rust, you might read it as: Rust is everything you don’t know about systems programming plus everything you don’t know about functional programming.

And there’s a lot to learn, right? Systems programming involves numeric types, two’s complement, memory layouts, pointers, undefined behavior, compiler optimizations and more. Functional programming involves higher order functions, algebraic data types, parametric polymorphism, type classes, typestate and more.

And this all isn’t just an intuition either. Like, surveys and studies have both shown that Rust’s complexity is one of its biggest barriers to adoption. And so broadly, I have been researching this question for several years. How can we empower more people to learn and use Rust?

And notably this question is not how can we make Rust easier? I don’t want to make the language less powerful. I don’t want to treat Rust’s users as incapable of becoming experts. And I emphasize both learn and use because I want learners to pick up the basics more quickly. And I also want experts to be more productive in complex code bases.

And the final hidden element here, which you will see as a theme throughout this talk, is I also want to be systematic. Like, for the entire history of computer science, programming languages and dev tools have been designed essentially on intuition about what makes people better programmers. And my goal is to make this as scientific as possible, meaning the incorporation of both human-centered design principles as well as formal theories of cognition.

And I hope that eventually, a science of human-centered programming can make us or can help us make all languages better, not just Rust. So that’s the high level motivation. And in the remainder of the talk, I’m gonna tell you about three different tools that we built to help people learn and use Rust.

By the way, each tool I’m gonna describe has been published as an academic paper that’s either open access or on arXiv. So if you wanna get more details that I don’t have time to talk about, check out the citations in the stream.

All right, so the first project I wanna talk about is a tool we built to visualize one of Rust’s most important features called ownership. Now before I dive into the tool, I’ll give you a little bit of background on both ownership and the related concept of memory safety.

So memory safety means a program which works with memory shouldn’t do bad things to memory. And lemme show you a concrete example of what that looks like. We’ll walk through a C++ program that contains a memory safety violation and we’ll use this as a running example. So it’s worth thinking about deeply here.

So on this first line, if we create a vector with the elements one, two and three, then in memory that looks like we have a variable v on the stack, which contains a pointer to an array allocated on the heap. On the next line, we create a pointer num to an element of v, right? The ampersand operator constructs a pointer and in memory, num on the stack points to the number three on the heap.

Then if we call v.push_back, this appends the number four to the end of the array, but there was no space in the original memory block. So it got deallocated, a new one was created, the elements were copied over and in the process, num was left pointing to now invalid memory. So it’s a little circle with a cross through it.

So then on line four, if we print out the number by dereferencing it, that is a violation of memory safety represented by this red circle. You can actually run this program like on my laptop, if you run g++ vec.cpp, you’ll get a binary, you’ll run it and it’ll print out random bytes, read from the invalid memory.

Okay, now this example is contrived, but memory safety is in general a very big issue for system software. So Rust uses its type system to ensure memory safety without garbage collection. And so for example, let me show you an equivalent Rust program to this C++ one.

I can make a vector, v equals vec, 1, 2, 3, I can create that pointer, num = &v[2], I can add to the vector v.push and I can print out the number. So same semantics, there’s really no effective difference between these programs. But what is different is when we try to compile it, we’ll get an error.
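Spelled out, the Rust program from the slide looks roughly like this (the line numbers referenced below are marked in comments):

```rust
fn main() {
    let mut v = vec![1, 2, 3];  // line 1: v owns a heap-allocated vector
    let num = &v[2];            // line 2: a shared (immutable) borrow of v
    v.push(4);                  // line 3: sugar for Vec::push(&mut v, 4), a mutable borrow
    println!("{}", *num);       // line 4: the borrow from line 2 is still in use here
}
```

Compiling this, rustc rejects line 3 with error E0502: cannot borrow `v` as mutable because it is also borrowed as immutable.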

So Rust recognizes that this is a memory safety violation and rejects the program, but what does this error actually mean? So in Rust, when we create a pointer like on line two, that’s called a borrow, and a borrow creates a reference. By default, that reference is immutable, meaning you can’t mutate the data behind the reference. It is also called a shared reference because you can share many copies of it to different data structures.

And on line three, this dot is syntactic sugar for calling the Vec push function with a mutable borrow of v. So a mutable borrow, also called a mutable reference or a unique reference, is something that allows you to mutate the data, and it also provides unique access to it. So intuitively, you cannot have a unique reference and a shared reference to the same data at the same time. That’s roughly what this error message says.

And so this concept, ownership, it’s foundational to Rust’s design. It is important that someone learning Rust understands it deeply. But there’s a fair amount of evidence that Rust developers have historically struggled with ownership, that they report it’s one of the most difficult parts of learning and using Rust.

So we sat down to understand this problem, like what makes ownership hard to learn? And after some study, our hypothesis was that one factor is that existing explanations of ownership were giving developers too ad hoc a mental model of this concept.

And to make that concrete, if you sat down to learn Rust, like most people, you would end up reading an online textbook such as “The Rust Programming Language” or just “The Rust Book” for short. This is the canonical textbook provided by the community for people to learn Rust.

And if you open up “The Rust Book” chapter about ownership, it’ll explain ownership in terms of all the various things you can’t do. So here’s some select quotes. You are not allowed to modify something you have an immutable reference to. You are not allowed to have two mutable references to the same data to allow for controlled mutation. If you have a reference to some data, then the compiler will ensure that data will not go out of scope before the reference to the data does.

The rules get sort of more and more complex from here. But the point is we felt that this approach to teaching ownership and really type systems more generally is too ad hoc. You can’t go on a case-by-case basis.

And so we set out to fix that, we wanted to give developers a mental model of ownership that was more constructive, like you could reason about the sort of ownership state of the program line by line and literally see ownership in the way that the compiler does.

So that idea became what we call the permissions model of ownership. I will explain it by walking through this example. This program is just like the vector one you saw earlier, but the push and the print are swapped. So this program is actually safe to execute.

So in this program, we create a vector v like before and this little box on the right tells us that the variable v has gained three permissions, read, write and own. And the permissions show you what you can and can’t do with a piece of data. So this vector is readable, like you could get its length, the vector is writable, like you could add an element to it. And the vector is ownable, meaning you could, say, transfer ownership to someone else or deallocate it.

And these little icons indicate the reason that a change in permissions is happening. So the up arrow means that a variable is initialized. On the next line we make a reference, num = &v[2]. You’ll notice this little orange dot, which indicates what the expected permission for an operation is. To create an immutable reference requires read permission on the variable v, which we have.

And then in terms of the permission state, this operation does two things. First, it removes the write and own permissions from v. So that means we cannot mutate or deallocate the vector through the variable v for the duration of this reference. And it also creates permissions for both the newly defined variable num and for *num. So the idea is that you have one set of permissions for the reference itself, and then one set of permissions for the data accessed through the reference. And because this is an immutable reference, you don’t have write access to the data inside.

Once num is no longer used after the print line, then v regains its write and own permissions. So this shows the idea that permissions are tied to the live ranges of variables, a fact which is otherwise invisible in this source code. And then finally, we are allowed to push to v on the next line and then v loses its permissions once it is no longer used.
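As a rough textual sketch of what the Aquascope diagram conveys (the real tool renders this graphically, and the annotations here are approximate):

```rust
fn main() {
    let mut v = vec![1, 2, 3];
    // v gains read (R), write (W), and own (O): it was just initialized

    let num = &v[2];
    // expects R on v, which we have
    // v keeps R but temporarily loses W and O for the duration of the borrow
    // num gains R and O (permissions on the reference itself)
    // *num gains only R (the data behind an immutable reference is not writable)

    println!("{}", *num);
    // num is no longer used after this line, so v regains W and O

    v.push(4);
    // expects R and W on v, which we have again
    // v is never used afterwards, so it loses all of its permissions
}
```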

So that is the permissions model of ownership. It emphasizes how the compiler abstracts operations into these three permissions. Everything can be thought of as either read, write, or own. And also how these permissions change as borrows come and go from the program. These permissions provide a more general set of rules than just looking at individual corner cases.

This visualization is entirely constructive in the sense that you can see it and look at it even when your program doesn’t have any errors. Like normally, Rust’s type checker doesn’t say anything when your program compiles, but this way you can peek into the state of the borrow checker as it’s running over your program.

And this idea can be made precise. So we’ve formalized this as an extension of the borrow checker, which is the subsystem of the type checker that deals with ownership. And we are able to automatically generate these diagrams for any Rust program that you put as input to the tool. You can try it out if you want. We have a live playground at cel.cs.brown.edu/aquascope, which is the name of the tool.

And one final thought about this tool, it’s interesting to compare it to other visualizations of ownership that people have made throughout the years. So if you look around the internet for ownership visualizations, you’ll find a half dozen diagrams that all look kind of like this. The unifying feature of these diagrams is that they all draw a lot of lines. And those lines represent the live ranges of variables.

But you’ll notice a conspicuous lack of lines, at least lines that depict lifetimes inside of our diagram. Why is that? Well, I think that the people who design these other diagrams have a tacit theory of what makes ownership difficult, right? Their theory is that reasoning about the lifetime of data within a function is the thing that makes ownership hard. And that just wasn’t at all our theory.

Like, our theory was that the hard parts of ownership were things like how borrows changed permissions, how permissions existed on variables and not on data, how permissions extended to fields of data structures, lots of things which you can again see more about if you read the paper or check the tool.

And we arrived at this theory not just by guessing but by actually studying the behavior of Rust programmers. So this visualizer was really just one component of a much broader project to help people learn ownership. And I wanna briefly zoom out to give you that bigger picture.

So before we built any tools, we started by investigating why do programmers struggle with ownership? To answer that, we collected questions on Stack Overflow that were the most commonly referenced questions related to ownership. So these questions contained code snippets that represented common problems that people encountered in practice.

And then we did a qualitative study where we asked 36 Rust developers to understand these snippets. By understand, I mean we would ask them questions such as: generate a counter-example. You know, give me a case where, if the compiler didn’t reject some problematic function, you could cause a memory safety issue, if one existed.

And our core finding in this study was that Rust developers who learned from this existing, sort of ad hoc pedagogy lacked a depth of understanding. For instance, when the compiler was protecting them from something bad, they couldn’t articulate what that bad thing was. Or when the compiler was being overprotective and rejecting a valid program, they would just assume that they were wrong, that the compiler was always right.

And so we needed some kind of richer pedagogy, embodied in both the permissions model and other tools, that would bring together this more holistic understanding of ownership. So that was what we wanted to do: develop a conceptual model, a way to think about ownership, and a pedagogy grounded in that conceptual model.

So this had two pieces. One is the compile-time diagram I already showed you. We also built a tool to visualize the runtime states of Rust code in a somewhat abstracted way, which actually generated the stack and heap diagram that I showed you earlier. But these diagrams also aren’t enough to teach someone about ownership, they’re just a diagram.

So we had to embed them in a broader pedagogy of ownership that we wrote up as an entirely new chapter on ownership in a fork of “The Rust Book”. So we took “The Rust Book”, we ripped out its ownership chapter, put in our new one with these diagrams.

And then finally, we still have to ask did this all work? So we evaluated the pedagogy. What we did is we hosted our fork of “The Rust Book” on a public URL. So you can also check that out, rust-book.cs.brown.edu. And we embedded multiple choice quiz questions about ownership into the book based on the open-ended questions that we asked in part one.

We then compared how readers answered these questions after reading sort of the baseline Rust ownership chapter versus our new one. And our primary finding is that the ownership pedagogy based on the permissions model improves scores on these questions by nine percentage points on average. So we have at least limited evidence from this initial deployment of the book that this new pedagogy demonstrably improves learning outcomes.

And it was really cool to see this project all the way through from start to completion. It sort of frustrates me a little bit how, with a lot of these other tools, people will simply throw a visualization out there in the world and say, I bet this is useful, right? But a real science of human-centered programming has to take it to completion: you should be driving these projects with some real sense of where user struggles are, as well as an understanding of cognition, of how people learn.

And you need to be evaluating these projects not in terms of people saying they liked my tool, but in terms of real, actual positive outcomes, like people being able to answer quiz questions at a higher rate, or something else concrete.

So we ended up using actually this forked textbook to run a bunch of other experiments on Rust learning. We’ve had about 150,000 people use this book in the last three years. So we have another paper you can check out if you’re curious about those experiments called “Profiling Programming Language Learning”.

Anyways, that’s what I wanna say about teaching folks ownership. And now I want to talk about our second project, a trait debugger. So this first project was about learning Rust, and the next two projects are about helping Rust developers work with complex code bases. This particular tool is about traits, which are Rust’s mechanism for doing stuff that looks like object-oriented programming. And specifically, this tool helps people debug trait errors, and to motivate that problem, I’m gonna start with a brief intro in case you haven’t seen traits or type classes in other languages.

So a trait in Rust is an interface that specifies some functionality, like the ToString trait here contains one function signature, to_string from self to string. Then traits can be implemented for types. So here we can implement the ToString trait for the 32-bit integer type i32. It will have some function that converts numbers to strings; its implementation is not important, but it would go there.

And these implementations can also be generic. So I can say for instance that ToString is implemented for all pairs of types S and T where S implements ToString and T implements ToString, giving me some other implementation.

So once you have these implementations, you use them in two parts. The first is to define some kind of generic function. So for example, print_items here takes as input a list of items where all of the items in the list implement ToString. And then I can loop through that list and print out item.to_string(). And I’m allowed to call to_string because I said the item type must implement that trait.

And then we can invoke print items. So for instance, here I can pass it a list of tuples of numbers and this is a perfectly valid program. Now the thing I wanna draw your attention to is what the type checker is doing when it’s looking at this function call to print items.
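Put into code, the running example looks roughly like this (the trait is defined locally here so the snippet is self-contained):

```rust
// A trait specifying some functionality: one method from self to String.
trait ToString {
    fn to_string(&self) -> String;
}

// Implement the trait for the 32-bit integer type.
impl ToString for i32 {
    fn to_string(&self) -> String {
        format!("{}", self)
    }
}

// A generic implementation: (S, T) implements ToString
// whenever S and T both implement ToString.
impl<S: ToString, T: ToString> ToString for (S, T) {
    fn to_string(&self) -> String {
        format!("({}, {})", self.0.to_string(), self.1.to_string())
    }
}

// A generic function: every item in the list must implement ToString.
fn print_items<T: ToString>(items: Vec<T>) {
    for item in items {
        println!("{}", item.to_string());
    }
}

fn main() {
    print_items(vec![(1, 2), (3, 4)]); // ok: (i32, i32) implements ToString
    // print_items(vec![(true, 1)]);   // error: the trait bound `bool: ToString` is not satisfied
}
```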

So it sees you calling print_items with the type (i32, i32). So it asks: does that type implement ToString? It looks around and it finds an implementation for pairs. It says yes, (i32, i32) implements ToString if both of its components, which here are both i32, implement ToString. So does i32 implement ToString? Look around, yeah, okay, I see that there’s an implementation specifically for that type. We’re good to go.

So inside the compiler there’s a component called the trait solver, which is doing this logical inference process every single time that it needs to check a trait. And now I want you to imagine we try to use a type for which we haven’t implemented the trait yet. So here, if we print out a Boolean and a number, the compiler will reject this code with the following error.

It’ll say the trait bound bool: ToString is not satisfied, okay? Pretty good error. It effectively blamed the component of the type that was problematic. And we get a lot of great bonus information. It tells us the source code of the function containing the call. It tells us where that source code is located. We get which argument contains the problematic type and which part is the actual function. We also get all the other types that implement this particular trait.

And that’s not all, we get the provenance of the trait that tells us, oh we were looking at this bool: ToString because we needed to implement ToString for this tuple type. And we get source map information for that obligation and more, it actually continues going off screen.

Let’s reflect for a bit on the purpose of this diagnostic. I would say it has three goals. One is to blame a particular part of the program for the high level error. The second is to provide context to say that you know, you’ve got something wrong but you might need this extra information to understand how you got there. And then the third is to suggest a fix, that maybe you meant i32 or maybe you meant another pair or something.

I think this diagnostic laudably accomplishes all three goals. And after all, you may have heard Rust is renowned for its good error messages. But it should be a little concerning to you. I set up a very small problem. There was one trait and two implementations and we already have a diagnostic that doesn’t fit on the screen.

So indeed, as your dependencies get more complex, so do your diagnostics. Just as a concrete example, here’s one I encountered in the wild. If you’re writing a Rust program that works with a SQL database, you might be tempted to use a library called Diesel, which is designed to use traits to check for malformed SQL queries.

And this diagnostic is generated from a function which inserted data into the wrong table. It caught that error. Awesome, we love it. Parse, don’t validate, but that’s a lot of code and my favorite part of this whole diagnostic is if you squint at a handful of places, it’ll even tell you the full name for this type has been written to a txt file.

Now you all laugh, but I know you’re crying. You’ve seen errors like this, you all deal with advanced type tools, it sucks, right? And we can’t just grin and bear it. This cannot be the future of programming when it comes to advanced statically typed programming. My take is that this is an information visualization problem.

There is simply just an enormous amount of info about both the trait inference itself and all the metadata like source mapping. And our take is that we need a more powerful graphical medium than the command line to deal with this problem to show you this information.

So let me show you what we built. That’s right, it’s live demo time. All right, so first we’re gonna take a look at the ToString program. This is the exact same source code that I showed you before and this is the same call to print items that doesn’t type check.

So we built a tool called Argus that lets you take a look at what the trait solver is doing. So let me open up this obligation. All right, so the idea here is that this, Argus is visualizing the structure of the trait inference tree. So the original bound they wanted to solve was bool, i32 implements ToString. I can click into that and see, okay, this is satisfied if this pair implementation is satisfied, and that’s satisfied if a couple of things are true, you can ignore these size bounds.

The two relevant ones are i32 needs to implement ToString, which it does, and bool needs to implement ToString, which it doesn’t. And so this shows me all of the information extracted out of the compiler in an interactive way that doesn’t put all of it in your face at once but allows you to explore it as relevant to the particular task that you have.

And there’s a lot of small details here to help make the information as compact as possible. For instance, the where bound on this is abbreviated by default as well as the generics up here. So if you care about this information, you can go check it out but by default, that’s hidden from you. Or likewise, if you wanna see all the types that implement ToString, you can click this little box and it’ll tell you i32 and pairs implement ToString. But that’s not shown to you by default. You have a means by which to ask for it, you ask for it and you get the information back.

And also, we don’t have to show one view on this trait inference tree. The top-down view corresponds to the thing I showed you on the slides. But you can also view this tree starting from the leaves. This is more how Rust’s diagnostics want you to think. So at the bottom of this trait inference tree is the failed leaf bool: ToString. And I can pop back up until I get to the root failure if I want to see it that way.

And our hypothesis is that these different projections of the proof tree, combined with these various interactive mechanisms for compacting the information coming out of the compiler, present the developer with an interface that is more useful for debugging these kinds of issues in practice.

Now this error won’t really show it ‘cause it’s too simple. So I’m gonna show you an example where Argus is a little bit more helpful. And for that, we’re gonna need to make a game. So I’m gonna show you a quick example of making a game with Bevy.

So Bevy is a game framework a lot of people like to use for Rust. And the basics of Bevy are that we can create an app and then add a system. A system in Bevy is a function that takes as input things that Bevy understands. So Bevy, in a sort of dependency injection way, will pass this function the inputs that it’s asking for, and then the function will process them and save some state.

And so just as a concrete example, we can make a setup function, and over here we can take some Commands as input. This is a special type that Bevy understands. Then let’s say we want a global timer or something, so we derive Resource on a timer type, and I can say commands.insert_resource(timer).

A resource in Bevy is like a global mutable variable that I can access in any of my systems, and there’s only one of it. And here’s another example of that. Let’s say we have meshes, so we take as input Assets<Mesh>. ResMut is a wrapper type in Bevy that says the thing you’re asking for is a resource, and I’ll give you a mutable handle onto this global data. So this is like a global store of meshes, and I can say meshes.add(Sphere::new(...).mesh()).

All right. So this is a perfectly valid thing I can write. I allege that; we’ll see if I, craning my neck, managed to actually get it right. Booyah, all right. But a very common error that a new Bevy learner, or someone experienced like myself, might make (I’ve done this too) is: oh whoops, I just forgot that wrapper type that was really essential, that Bevy needed in order to understand what the system was trying to do.
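In code, the system from the demo looks roughly like this. This is approximately the Bevy 0.11-and-later API; the exact names and the mesh-creation call vary between Bevy versions, so treat it as a sketch rather than the literal demo code:

```rust
use bevy::prelude::*;

// A resource is like a global mutable variable, accessible from any system.
#[derive(Resource)]
struct GameTimer(Timer);

// A valid system: Bevy knows how to inject `Commands` and `ResMut<Assets<Mesh>>`.
fn setup(mut commands: Commands, meshes: ResMut<Assets<Mesh>>) {
    commands.insert_resource(GameTimer(Timer::from_seconds(1.0, TimerMode::Repeating)));
    // In the demo, a sphere mesh gets added to the global mesh store here,
    // roughly `meshes.add(Sphere::new(...).mesh())`; the exact call depends on the Bevy version.
    let _ = meshes;
}

// The common mistake from the demo: forgetting the `ResMut` wrapper.
// `Assets<Mesh>` on its own is not a valid system parameter, so this signature
// no longer "describes a valid system configuration":
//
//   fn setup(mut commands: Commands, meshes: Assets<Mesh>) { ... }

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, setup)
        .run();
}
```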

All right, so now I’ve somehow violated the invariants of the system. Let’s see how Bevy responds. Check, whoa, hey. Well, look at that, we got a compiler error, right? That’s statically typed programming at work. So it did catch that we passed our app a system that it doesn’t accept.

But now let’s take a closer look at the error. It says this big function type doesn’t describe a valid system configuration. Thank you. And it says the reason is that it doesn’t implement some giant trait, IntoScheduleConfigs, and then it provides a bunch of useless help and then tells us about a bound that doesn’t help us either. Oh, and we also get a type written to a text file, that’s extra handy.

This is a real problem in Bevy, as well as in many other communities that use trait-heavy crates. Here’s some more intuition on why this is a problem. Bevy actually can’t blame a specific trait bound, because there are multiple ways that an input type can be turned into a system. And so Bevy doesn’t know which one of multiple possible competing implementations is the one that you were intending.

Or sorry, excuse me, the Rust compiler doesn’t know, and so it doesn’t try to disambiguate for you. And if there are any Rustaceans in the audience thinking about the orphan rule, think of it more as an orphan guideline; crates like Bevy can work around it.

So when you’re in a situation like this, you get an error that’s decent, it does tell you at least which function was wrong, but it can’t localize the problem effectively. Instead, we can look at this in Argus.

So if I open up Argus and then I pop open the problematic trait bound, it’ll load and, voila, if we look at the bottom-up view inside of Argus, it’ll tell us all of the different possible leaf failures. And in this case, Assets: SystemParam is the one we actually care about, because that particular type does not implement this subtrait SystemParam, meaning something you can pass to a system. But it also shows you the other things you might've meant, like maybe you wanted to manually implement the System trait for this setup function, or something like that.

All of that information is available to you if you need it; Argus doesn’t have to hide it away or throw it away because the diagnostic would get too large. And it also provides even more help. For instance, if you don’t know what SystemParam is, you can hover over it and see the fully qualified path to the type. You can click on it and it’ll jump to where that’s defined inside of the code base.

And if you wanna understand the trait inference process better, you can jump over to the top-down view and see the sequence of trait inferences that would have led to it being satisfied if you got to that point.

And so I hope you get the sense of how this is immediately useful. If you were a Bevy developer, you would find this bottom-up view more informative because it at least points to something more useful than just the top failed bound. But the other thing I want you to think about is: how do we visualize types? That to me is one of the core questions of this project. And big blobs of text cannot be the answer. We can only take pretty printing so far.

So this was a first step in the direction of what would an interactive interface look like for types. But I think that there’s so many more things we could do if we released ourselves from the worldly constraints of the command line.

All right, back to the slides. That’s the basics of how Argus works. I should mention I talked up being systematic. We did run a user study to evaluate whether this actually helped developers where we gave them various trait-related errors with and without Argus and we found that the developers with Argus could localize an error, like pinpoint a specific root cause and blame that about three times faster versus just using the compiler’s diagnostics.

We’ve gotten some great early feedback from library developers who’ve said they found it really helpful for diagnosing bugs in their own libraries. And this tool is of course free and open source, integrated into VS Code, if you wanna check it out at cognitive-engineering-lab/argus.

And now moving on, let’s talk about program slicing. So for the last project, I wanna look at a different problem which is one you’ll find in any programming language, not just Rust. So let’s say you’re trying to understand a complicated function and you only care about some aspect of that function, how can you filter out code that isn’t relevant to your goal? You’ve probably experienced this at some point in your software engineering careers.

Well, there’s a mechanism actually for that called a program slice. I learned about this and said hey, we need to have a tool that does program slicing, a program slicer. And so I built one and it’s just easiest to explain what it does by showing it to you.

So, oh that was anticlimactic. There we go, back to the demo. ♪ Do, do, do ♪ Oh no, that’s because it’s on the other screen. There we go.

All right, so the particulars of this code are not super relevant but at a high level, this is just an example of a Rust function that’s parsing a URL. So up here at the top, we take as input a URL as a string and we return some kind of like structured URL thing that understands each component. And the idea is that this function has both shared state but also it has a lot of independent components that are doing different things. So like, parsing some parts of the URL aren’t related to parsing other parts of the URL.

And, hypothetically, say some unit test is failing, and so you know that some component of the URL at the end is wrong, like, ah, I’m getting the wrong path. Then normally, to understand how this thing is defined, you would just have to read through the code to figure out why this variable is going wrong.

But with my tool, with a program slicer, it’s as simple as focus mode. So, might’ve missed it. So normal code, focused code. The idea is that when you click on a variable, so I’ve clicked on path here, then it fades out all code which cannot affect the value of that variable. And so for instance, the thing that defines path is relevant up here, it turns out there’s like a bug in this code because this thing which isn’t supposed to modify path actually mutates it down here and then all this code which can’t affect its value is faded out as well.

You can click around if you want to understand like, say, where’s hostname used in this function? You can see, oh it doesn’t have any effect until I get to the very bottom. And then we take, put it as input to this URL structure.

So this is called a program slicer because it’s showing, above the selected code, all the things that could influence its value, and after it, all of the things that it could influence. Hopefully, it’s a relatively simple concept. I’ve actually done some studies and people pick this up almost instantly. Like, I’ll have someone try this tool and then I take it away from them, and then they start clicking on things and they’re like, why is my code not changing? So I think it slips into the background very quickly.
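As a toy illustration of the idea (not the actual URL-parsing demo code), if you clicked on x in a function like this, a slicer would keep the marked lines and fade the rest:

```rust
fn example() -> (i32, i32) {
    let mut x = 1;    // kept: defines x
    let mut y = 10;   // faded: cannot affect x
    x += 2;           // kept: mutates x (part of the backward slice of x)
    y *= 3;           // faded: independent of x
    let z = x * 100;  // kept: influenced by x (part of the forward slice)
    (z, y)
}

fn main() {
    println!("{:?}", example());
}
```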

What’s more interesting about this tool is why doesn’t this exist already? Like, shouldn’t this have already existed for Python, for Java? This seems like a good idea, let’s implement it everywhere. But it turns out what’s interesting about this project is that it relies on an aspect of Rust’s design that is relatively unique to the language. And I think it shows an interesting way in which you can use a language’s design to build better developer tools that it wasn’t initially designed for.

All right, to make that concrete, let’s imagine you wanted to build a slicer for Python, and as a running example, say we have a vector with a string, my favorite. You get a string from that vector, s = v[0], you append " world" to that string, you print out the vector.

And if you were implementing a slicer, you would be tasked with answering lots of questions day in and day out of this form, does this print depend on this append? Like, if I clicked on v for instance, would the line s += “ world” show up in the slice of that vector? Who knows, yes, no? Let’s get thumbs up if it’s a yes and thumbs down if it’s a no and you can do this thumb if you dislike Python.

We’re pretty split, fewer dislikes for Python than I expected, that’s good. So the correct answer is it doesn’t, actually, because the way that this works is that if you mutated this vector directly, if you say v[0] += " world", it would affect it. But if you pull the string out and then mutate that, that is no longer an alias. The mutated object is no longer an alias to the original.

And the point here more generally is that building slices is a little subtle and specifically, it’s built on understanding the aliasing structure and mutation structure in your language. So what I mean by that is like if I do something plus equals something, then whatever’s on the left hand side is mutated. That’s pretty straightforward. And when I say something like v[0] that is an alias of v, it refers to part of it, but in this case s does not alias v.

And so what you’re trying to do in a slicer is ask: when am I mutating something? And when I mutate something, what is it aliasing? And then I take all of its aliases and mark them as dirty or influenced or whatever.

But the problem, well, one problem, this is an interesting technical area, but the particular problem that I was focusing on is in a lot of code, you have calls to mystery functions, which could be code in another language, it could be a higher order function, it could be a closed source library. There’s a lot of reasons why you may not be able to look into the definitions of these functions. And then how are you supposed to know whether mystery one and mystery two cause an effect on v?

Put another way, how can we compute program slices through black box functions in a way that’s static and sound and precise? And it turns out that Rust makes this problem tractable.

So let’s set up a similar problem as we had with Python, but this time in Rust. So we again have a vector with a string, we can then get a reference to the first element of the vector, which I will spell out as .get_mut .unwrap. We then push into s and then print out v. Unlike Python, here the effect on s does actually affect v.

So what we need to figure out is how are we supposed to know that the push affects the print? And again, the constraint is that we wanna do this without looking at the definitions of push_str and get_mut.

And that really requires understanding two questions. How do we know that get_mut returns a pointer to v specifically? And how do we know that push_str mutates its input?

And so the seed of this answer starts by looking at the type of the variable s. So in C++, s has a very simple type. If you wrote a similar program in C++, it would just be string*. Boring. In Rust, it has a much more fun type. It says &'a mut String. This is the stuff from the first part of the talk that scares people away.

So that 'a is a lifetime. It’s a formal concept that Rust uses to analyze how long a pointer will live. I didn’t talk about it too much in the first half, but that’s the other half of ownership that’s hard for people to learn. And this mut keyword is the mutability modifier. You did see that part earlier.

And what’s cool is that we can repurpose this information for slicing. So for example, how do we know that get_mut returns a pointer to v? If you look at the type signature of get_mut, it says that the input vector v must share a lifetime with the output element s, and it’s from that shared lifetime that we can infer an aliasing relationship.

And then we know that push_str mutates s because push_str requires a mutable reference to its input. Whereas for instance, if you ask for the length of a vector, that only requires a shared or immutable reference. So we can assume that will have no effect on the vector.
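To make that concrete, here are illustrative stand-in functions with the same shapes as the standard-library signatures (get_mut, push_str, len), with the elided lifetime written out:

```rust
// The output reference shares lifetime 'a with the mutable borrow of the input,
// so whatever comes out must point into v: an aliasing relationship.
fn get_mut_like<'a, T>(v: &'a mut Vec<T>, index: usize) -> Option<&'a mut T> {
    v.get_mut(index)
}

// Takes a mutable reference, so we assume the call may mutate its input.
fn push_str_like(s: &mut String, suffix: &str) {
    s.push_str(suffix);
}

// Takes only a shared reference, so we assume the call cannot mutate the vector.
fn len_like<T>(v: &Vec<T>) -> usize {
    v.len()
}

fn main() {
    let mut v = vec![String::from("Hello")];
    let s = get_mut_like(&mut v, 0).unwrap();
    push_str_like(s, " world");
    println!("{} item(s): {:?}", len_like(&v), v);
}
```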

And honestly, that’s really the core idea: using ownership types, we can compute slices going through this black-box code, and with this extra information, we can make more precise assumptions than we could otherwise.

Now you might ask like how accurate is this technique? You know, for instance, this technique might make up a dependency where none exists. Like, if you gave me a function that takes a mutable reference and doesn’t do anything with it or doesn’t mutate it, that’s sort of inconsistent with our assumptions. Or, this technique might miss a real dependency. Like, if a function can mutate something through an immutable reference, that would break our assumptions too.

And so one cool thing about this tool is we are able to answer both of these questions with sort of a mixture of mathematical and statistical techniques. So we would want our analysis to be what we call sound and precise, sound meaning that real dependencies in the program are found by the slicer and precise meaning the dependencies found by the slicer are real.

So this slicer is sound in the safe subset of Rust, which we can actually prove mathematically by embedding this analysis into a formal model of Rust. But it’s unsound in the presence of certain features implemented in unsafe Rust, like interior mutability, which is how Rust models things like mutexes that need to provide mutable access to shared data. So basically what that means is that in rare cases as a user, you might click a variable and the tool will fade out more code than it should.
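For instance, RefCell is a safe wrapper (implemented with unsafe code internally) that allows exactly the kind of mutation through a shared reference that the slicer’s assumptions rule out. A minimal sketch:

```rust
use std::cell::RefCell;

// This function mutates its argument even though it only takes a shared reference,
// which violates the "shared reference means no mutation" assumption.
fn sneaky_mutation(cell: &RefCell<Vec<i32>>) {
    cell.borrow_mut().push(4);
}

fn main() {
    let cell = RefCell::new(vec![1, 2, 3]);
    sneaky_mutation(&cell);          // a slicer relying only on &-vs-&mut would assume no effect
    println!("{:?}", cell.borrow()); // but the vector now contains [1, 2, 3, 4]
}
```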

And then for precision, we can answer that too. This one’s a little harder because it depends on how often developers write code that makes the tool imprecise. And so to look at that, we actually ran an experiment where we made a version of the slicer that’s allowed to peek into the definitions of the functions that you would normally not look at. And then we can compare these two versions on 400,000 lines of Rust code from real projects.

And what we find is that when you’re allowed to peek inside of these functions, you don’t actually learn that much in terms of the simple questions we’re trying to ask, that these produce the same answer in 94% of cases because for instance, if somebody takes a mutable reference, they’re likely gonna mutate that data. It’s uncommon not to do that.

And so that’s how we build a program slicer for Rust. We use Rust’s type system to make the slicing problem more tractable. And the goal is that it can make it easier to filter out irrelevant code in large Rust programs. This tool is also free and open source if you wanna try it out for yourself at github/willcrichton/flowistry.

All right, so couple things before I wrap up. First, this was all made possible by a host of collaborators, advisors and funders. I’m gonna briefly call out Gavin, now a PhD student at Brown, was the lead developer on both the ownership visualization and the trait debugging projects. Those two projects were also supervised by my mentor and now colleague, Shiram. The slicing work was done in collaboration with Marco and advised by my PhD advisors, Pat and Maneesh.

The experiments with “The Rust Book” were made possible by the contributions and support of both Carol Nichols and Steve Klabnik. And my thanks to the entire Rust community who participated in that experiment or my many other user studies. And all of this research was funded by the NSF, DARPA and Amazon Web Services through Niko Matsakis, so a big thanks to everyone on this slide.

Now, before I finally wrap, I just wanna leave you with a couple reflections that I was thinking about today about this work and how it relates to the current trends inside of developer tools and what other things you might take away from this talk.

So we have to place this in the context of everyone being excited about AI agents. The modal thing that someone is making or funding or whatever, when it comes to the future of programming, is some kind of tool that you can hand a pull request or a GitHub issue and it’ll automatically fix it for you or something, which is fine. It kind of sucks that it’s often phrased in this form of: these things are coming for your jobs, and programmers are no longer necessary.

I think a much more interesting and productive framing is to think about augmentation rather than automation, especially because people have been fearmongering about automated programming literally since the beginning of computer science. Like, the first compiler built by Grace Hopper. People were looking at that going like, this is garbage. A compiler can’t spit out octal code like people can. That’s a human job.

And so, no matter where these tools go, I think it is more useful to think of them as: how can we make cyborg programmers more effective, rather than how can we just get rid of programmers entirely?

And related to that: cognitive science. I think there’s a sense in which you could just ignore humans entirely. If we just focus on building the machine God, then we can forget all of our worries and let them take over. But the thing is, as long as humans have agency and care about the computers that we’re working with, we will need an interface into them. At some level, these tools will need to explain to us the behavior of this code. We will want to understand what they’re doing.

And that means you have to understand human cognition in order to effectively communicate the behavior of code to someone. Whether that is information visualization, technical communication, like there’s a whole host of techniques, but they’re going to be useful until the final day at which we can, you know, let ourselves out and like robots will take over humanity. But I’m positive about ignoring that possibility and trying just to focus on the happy thoughts.

The other thing I want to note is that with this AI stuff, of course the big issue is reliability, right? And programming language theory is almost the complete opposite of the way that machine learning folks like to approach problems. PL theory is about building tools that we understand so deeply that I can tell you categorically that this tool will always find your dependencies, or that it will only miss them in certain well-characterized cases.

And so I think I’m very optimistic about using PL theory to build more effective dev tools that specifically can have a much greater reliability than you might find through, you know, modern AI techniques.

And the last point I wanna note is people think about dev tools often for experienced developers. You’re building visualizations, debuggers, loggers, whatever for big, complex code bases. But I actually don’t think there’s enough work on building developer tools for learners. Like, you know, the Aquascope diagram, the ownership visualizer, that doesn’t scale. You can’t produce a diagram that makes clear sense on 300 lines of code, but it doesn’t have to.

And there’s still a lot of value in having a dev tool that’s formally, precisely defined, so people can play around with it and there’s no confusion about what it means. And I think there’s just a big open space here. Especially, the interesting thing for learners is that the space of ways people can misunderstand a programming language is very large, right? There’s a very small, constrained space of correct mental models, and there is a very big space of all the things that people can fail to get about what you’re building.

And understanding that space, building tools that can model how people think and sort of contextually give them the information they need to overcome their misconceptions, that I think is just a really cool area of dev tool development right now that I would love to see more work in that space.

So those are just a few thoughts. Come ask me afterwards if you want to chat about this stuff more. I have lots more, but I’ll end here. Thank you for listening. All right, gimme your hardest questions. I think we’re throwing this thing.

Yeah, so thanks for the talk. This is really fun, especially as someone who used to write a lot of Rust. I’m curious about your, about async specifically. ‘Cause I feel like that’s a really big hurdle for Rust developers, is writing async code and specifically the sort of like cross combination of async traits and lifetimes gets especially tricky. I remember trying to write like a retry function that takes in an async function and does the correct things with the lifetimes and just getting very stuck on that. So I’m curious your thoughts on that space.

Yeah, I agree that async, one thing, actually, this is notable. So if you look in the Rust survey data for most of history since the surveys began, Rust survey said that ownership was the hardest part of the language, but it was dethroned this year by async. So people are getting increasingly concerned about this question too.

And we just started a project looking at the complexities people encounter with async. And I mean, everything you mentioned, I do, I think is a big like usability pitfall. But to me, the more fundamental problem, especially that we’re thinking about right now is like what the purpose of an async system is, why and when you want to write async code and the relationship between sort of like the mechanisms of an async system like coroutines and promises and the things you’re trying to do with it, like implement cancellation or so on.

There’s, I think people don’t understand this like nexus of concepts because it’s evolved in a very incremental way over time. And there’s a lot of subtleties that we’ve come to appreciate as we like delve deeper into this. You know, like a good example is like if I spawn an async task and my function returns, what happens to that task? Like, should it continue running? Should it get canceled? Should you wait until you, until that task completes? You’ll find async runtimes in different programming languages that adopt all three of those strategies.

And so our actual current focus is just characterizing the async design space: what are all of the relevant concepts you need to be equipped with to understand how Rust is designed, but also to then go in and understand how, say, Swift is designed, or Kotlin, or whatever, because they’re all taking very different approaches. And I think a big issue is just that people aren’t equipped to know what to look for. There’s so many idiosyncratic design decisions that have gone into each system that have yet to be articulated.

And so we are focused on like trying to even just put down the design space into words so that we can start to look more concrete being like, oh, the issue is that people don’t understand how like coroutines model futures or something. But the high level space itself is so fuzzy that that’s actually where we’re starting, yeah.

As someone who has spent too much time reading about cancellation safety in Tokio, nice.

Yeah, other questions?

Hey, thanks for coming. So you mentioned that like Rust is like a target of a lot of this work and also talking about how like programming language theory and Rust’s static type system and ownership help you develop tools here that make it easier for the learners, but also people are writing lots and lots and lots of code in dynamic languages. I guess there’s been some effort recently to try to adopt more, like more typing features in those with gradual typing and stuff. But have you thought about how some of this might translate to dynamic context or is it kind of just hopeless?

Yeah, so I do, like I’ve thought about gradual typing a little bit ‘cause it’s an area of particular interest to me. I have friends who find languages like TypeScript extremely frustrating in particular because there’s a big gap between the academic approach to gradual typing. Like, oh, versus the actual practice of TypeScript, which is more like, we’re gonna build this massively complicated type system and sometimes it won’t work. You’ll just write unsound things and that’s cool ‘cause you won’t, it won’t happen that often.

And I think that’s interesting because that’s, here you have one of the biggest gaps between how theoreticians wanna think about programming languages and how they’re actually implemented. And I don’t think we have yet a handle on the right way to sort of address that gap. You know, a concrete instance of that gap or like another instance of that gap is that it’s, there’s like a phenomenon that happens where you can just turn off your type system whenever you want to, like, oh, cast this to any, like my compiler’s complaining at me and make it an any, move on, right?

And that’s a problem. You shouldn’t just be able to do that. And in particular, it produces this weird situation where when there’s velocity constraints, like I need to ship this feature by next week, and the compiler’s yelling at you, if you can work around the compiler, you will, and that will just accumulate tech debt down the line.

So I think there’s almost like a macroeconomic question about how do you balance the need for velocity with the, you know, accumulation of tech debt that will eventually result because like Rust has the opposite problem. Rust will not let you compile lots of programs and you, and people get frustrated and go like, I’m just not productive with Rust because I can’t write code fast enough, I can’t ship features fast enough. But if you’re working on like a really sensitive code base, maybe that’s not an issue for you. You go, we wanna ship slowly because the goal here is not to have any bugs.

So I feel like part of what’s missing is a broader analysis of like developer productivity relative to type system design that just no one’s really done effectively, in part because it’s very hard to ask those kinds of questions in a sort of semi-controlled manner. Like we just don’t have enough economists in programming language theory, I suppose.

So I think that’s the area I would be interested in investigating is like, how could you actually adjudicate the static types debate? How could you have better, produce better evidence on like when it’s actually important to have these types to catch real mistakes versus they’re just slowing you down, for instance. You know, those are the types of questions we’re not equipped to answer, but I think that’s the good direction for future research.

Do you think that the tools that you showed us or similar tools which are useful for programmers to write programs or debug compiler errors and so on are also useful to a similar degree for like an AI agent or an LLM to write programs?

Yeah, so some of them, like actually I was talking with Niko about this, the trait debugger, and he was like, oh, you should have a model context protocol thing where you could spit out the entire trait inference tree into some textual format that an LLM could handle and it might be able to also provide hints or otherwise augment your interface. So I do think there’s a sense in which if you build compilers that are less black boxy, that are willing to expose those internals, you could pass those internals to a language model or something and it might be able to help you.

I don’t think, for instance, giving a language model one of these diagrams is better than just feeding it a textual representation. I think the flip side though is these models could have better tools for diagramming and talking to you. Like, it’s sort of insane to me that people learn stuff in markdown. Like when you go and ask ChatGPT or whatever to explain something for you, it is constrained to like whatever markdown is allowed to visualize. And that’s a very limited subset of technical communication. It can’t do a lot of interesting diagrams or like run experiments or things like that.

And so I’m personally interested in ways in which you could enrich the communication language of your assistant tools so that way they could, like, give you more interesting kinds of explanations than just like text around code snippets.

So I think oftentimes when people who are new to like a programming language with advanced features, there tends to be like the development of like some forms of like workaround. So a classic one in Rust is like just clone everything if the ownership checker, borrow checker is getting mad at you. And I think there’s like a similar thing maybe going on with like the work that we have done internally with OCaml, which is like, if you don’t need the advanced feature, you like don’t need to think about it, but then maybe one day you do need the advanced feature and like you don’t know what to do in that circumstance. And so I’m curious if you have thoughts about how to like, take people from these sort of workarounds that they have towards a better understanding in like a more dynamic way. Yeah, more continuous way.

Yeah, it’s a good question ‘cause a lot of people have asked me, oh, could we make some version of Rust that gets rid of ownership, that automatically does the cloning or is automatically reference counted, but otherwise it’s Rust. So you could start folks learning in that sort of simpler language and then they could gradually transition to real Rust.

And I think that’s interesting. But here’s the risk you run: for one of the programs we looked at in our ownership study, we found that people would clone in a way that was actually incorrect. They were trying to mutate an object in place, and it was very important to mutate that specific object in place. They would run into a borrow checker error, so then they would clone that object and mutate the clone, and then the function would return and nothing had happened, right? Because they mutated the clone and then dropped it.
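As a minimal sketch of that failure mode (hypothetical types and names, not the actual study program):

```rust
#[derive(Debug, Clone)]
struct Account { balance: i32 }

// The intent is to mutate the account in place, but to silence a borrow checker
// error the author clones it, mutates the clone, and returns: nothing happens.
fn deposit_broken(account: &Account, amount: i32) {
    let mut copy = account.clone(); // workaround: clone instead of taking &mut
    copy.balance += amount;
    println!("inside the function: {:?}", copy); // looks right here...
} // ...but the clone is dropped, so the caller's account is unchanged

fn main() {
    let account = Account { balance: 100 };
    deposit_broken(&account, 50);
    println!("after the call: {:?}", account); // still Account { balance: 100 }
}
```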

And that’s a good example of where you can end up in situations where, if you haven’t architected your code correctly, your workarounds will fail. And so from that perspective, I think it’s ideal if you can teach people the foundations from early on. I think for something like Rust, you can’t avoid ownership because it’s so pervasive throughout the language. It’s better to learn it that way. Although, you know, cloning something after the fact is a useful workaround as a last resort.

But I think this other idea, of articulating a coherent language-level story for what it means to have the entirety of this language without this feature, and then ideally smoothly transitioning between the two, would probably be one good step forward. The challenge with that in something like Rust is it’s not fully clear what it means to say “what if Rust were garbage collected”, because there are key elements of Rust, for instance, that require deterministic deallocation. Like, the drop feature for mutexes is an important part of how mutexes are designed. And if you rely on the garbage collector to unlock your mutex, that would be a concerning design decision.

So I think that’s the sense in which there’s no obvious or trivial way to, for instance, lift 85% of Rust semantics into a higher-level language. But I think if you could solve that problem, that would be a really helpful tool, especially if you could solve it and then establish a gradual adoption path. Like, you could work in GC Rust and then slowly add ownership-y bits or something, and there was a well-defined transition point. I’ve always been interested in what gradual garbage collection would look like, but I haven’t seen many interesting prototypes in that direction, so.

You should read ours.

Yeah.

What tools could the languages you work with give you to make your job easier as someone who builds visualizations for them?

So the biggest offender is, where’s my cursor, this guy: Visual Studio Code. This has single-handedly been holding back progress in programming for years, for two reasons. One, everyone uses VS Code. I checked the Stack Overflow survey, it’s something like 80% adoption; a lot of people use this. And the problem with VS Code is it allows you to do very little in the way of overlays, visualizations, any amount of creativity you wanna put on top of the code. Not allowed.

So Argus worked because we could put it in a totally separate panel. But we’ve wanted, for instance, to put the Aquascope ownership visualization in there, and we can’t. So we implemented that with CodeMirror, which is a much better-designed tool. If I could just wave a wand, I would simply build a feature-complete IDE around CodeMirror and then everyone would use that. But I can’t wave my wand, so in the meantime, the problem is you’re very restricted in doing visualizations. Flowistry was possible because fading code and highlighting it is within the scope of what these extensions can visualize.

But in practice, if you wanna do anything more interesting, you start forking VS Code. And God, there are too many forks of VS Code in the world. We don’t need another one. Like for academics, it’s frustrating because then nobody can use your tool. If you’re in industry, well now you have to like maintain this fork and people have to have multiple IDEs that don’t interoperate on their computer.

So I think having a better platform for implementing extensible visualizations and doing more interesting decorations on code is my short answer. And it’s just, we have one, it’s called CodeMirror, and we don’t use it. So that would be my longer answer.

Anyways, thank you again for listening.
