
Mojo Lang - Tomorrow's High Performance Python? (with Chris Lattner)

Developer Voices

Mojo is the latest language from the creator of Swift and LLVM. It’s an attempt to take some of the best techniques from CPU/GPU-level programming and package them up in a Python-compatible syntax.

In this episode we explore why Mojo was created, and what it offers to Python programmers and non-Python programmers alike. How is it built for performance, and which performance features matter? What’s its take on functional programming and type systems? And can it marry the high-level programming of Python with the low-level programming of LLVM/MLIR?

If you’re a Python programmer who needs better performance, a C programmer who expects more from a ‘scripting language’, or just someone who’d be happier if Python had a first-class type system, Mojo might well be for you…

Mojo: https://www.modular.com/max/mojo

Mojo’s Roadmap: https://docs.modular.com/mojo/roadmap.html

The Mojo Discord: https://discord.com/invite/modular

MLIR: https://mlir.llvm.org/

Chris’s Talks: https://nondot.org/sabre/Resume.html#talks

Chris on Twitter: https://twitter.com/clattner_llvm

Kris on Mastodon: http://mastodon.social/@krisajenkins

Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/

Kris on Twitter: https://twitter.com/krisajenkins

#software #podcast #mojolang #ml #pythonml

Transcript

Can Python become a high-performance parallel language?

00:00:00
Speaker
Could Python become a high-performance parallel programming language? That seems unlikely. There are some fundamental architectural reasons why that feels like a bigger shift than Python 2 to Python 3, and that was a big enough shift already. But I raised the question so that I can flip it on its head. Could a high-performance parallel programming language become Python?
00:00:24
Speaker
That to me seems more reasonable. Start with a low-level, compiler-level language, do your performance work, figure out those parts, and then when you build up into user space and choose a syntax, why not Python?

Introducing Chris Lattner and Mojo

00:00:40
Speaker
Now that, broadly, is the approach taken by the new language Mojo and its creator, Chris Lattner. If you've not heard of Chris, I'm certain you've heard of his work. He created LLVM. He created Clang, the other big C/C++ compiler. He created Swift, the language of shiny Apple products everywhere. He has serious form in language design.
00:01:03
Speaker
And he's one of the few people I know who could actually run the gamut from Python syntax all the way down to CPU and GPU and back.
00:01:12
Speaker
And he has some really interesting ideas for his latest language, like: let's do Python, but bring in a gradual type system. Let's make parallelization a first-class concern. Let's give the programmer direct access to compiler instructions. Lots of interesting new toys in familiar clothing. So I suggest we go and take a look at it. I'm your host, Kris Jenkins. This is Developer Voices, and today's voice is Chris Lattner.
00:01:51
Speaker
Joining me today is Chris Lattner. Chris, how are you?

The journey from LLVM to Swift

00:01:54
Speaker
I'm doing great. Thank you for having me, Chris. Absolute pleasure. It's not often I get a guest with quite the CV, quite the resume as you say, that you have. You've got a heck of a backstory, which we could talk about in itself, but we always talk about the future on this podcast. So let me get this straight. You've been a founding part of LLVM.
00:02:17
Speaker
Yeah. That was my master's research project. Yeah. That's a good start. You didn't rest on your laurels. You went on to Clang and Swift. Yeah. Also a nights-and-weekends project. Yeah. A nights-and-weekends project for a year and a half, and then it kind of grew a little bit beyond that. So was that how it got started?
00:02:35
Speaker
Yeah, yeah. So I mean, I'm the kind of person that's working on the thing I'm supposed to be working on during the day, but also pushing boundaries during the evenings. And so in the case of Swift, we were wrapping up the Clang C and C++ compiler. And C++ is a very interesting language and a very challenging technical problem to implement. And at least for me, I could not survive that experience without thinking maybe there could be something new and better. And so this is where

Why create a new language? The motivation behind Mojo

00:03:00
Speaker
Swift started bubbling in the background, and eventually
00:03:03
Speaker
got to the point where I could understand what it was that I was building and working on, and then shared it with some other people. I always feel like Swift was the language that saved the people who weren't me. I spent a good number of years writing Objective-C, kind of drowned and got off that boat, and then Swift came along just too late to save me.
00:03:24
Speaker
Yes, which is a great language. It was a lot of fun. I learned a lot from that. So this is the thing. With a background like yours, there's almost no point asking you why you're writing a new language, because it seems almost inevitable that you would write a new language. What we have to talk about is why this one.
00:03:44
Speaker
Yeah, maybe you could flip it around and say, given you know how hard it is, how can you be insane enough to do this again? Well, you look like a glutton for punishment. Yeah, well, I mean, I'm not afraid of hard things, I guess. I think that is true. But really, I mean, you've talked with many folks that have built languages and are working on them, and everybody has different motivations. For me, it really comes down to solving a problem.

Mojo's performance-centric design

00:04:07
Speaker
And so I actually resisted building a language. So in this case, Mojo, because having
00:04:14
Speaker
gone through, for example, the Swift experience. Before that, I also built OpenCL, which is a GPU programming thing, many years ago. When you start from "let's go solve the programmer's problem," you typically start from syntax.
00:04:29
Speaker
So in the case of Swift, we started with, okay, I want auto closures and I want these higher-order functional programming things, and I want to be able to compose that into the LLVM compilation flow, and all these kinds of things. We very much started from syntax. In the case of Mojo, at Modular we started with: how do we make GPUs go brr?
00:04:49
Speaker
How do we handle these crazy high-performance CPUs that have matrix operations and Bfloat16 and other AI extensions? How do we rationalize this wide array of different hardware and put it into a system where we can program it?
00:05:04
Speaker
And what's changed in hardware is that it used to be that a CPU would come out every year, and it would be slightly better than the old one. And everything would just magically go faster. A compiler might do a different instruction scheduling or certain code generation tricks. But generally, if you recompiled your code, old code ran faster. But today, what's happening is we get all these really weird, dedicated hardware blocks. And these are very specialized. I mean, the most extreme is a GPU.
00:05:33
Speaker
And so being able to program these things requires fundamentally different kinds of software. And so what we started on at Modular is not building a language. Actually, what we started building is a very fancy code generator that had no front end. And so we just built pure compiler code gen stuff, like replacing big chunks of LLVM, replacing and building new ways of synthesizing high performance loop nests and things like this. And then just wrote everything in IR directly.
00:06:01
Speaker
And so we were using, in this case, a new compiler framework we'd built called MLIR. It's now part of the LLVM family, but it's a next-generation replacement for LLVM in many ways. And so that allows you to just write IR. And we did that for quite some months, just to prove out that the code generation philosophy in the stack could work and deliver the results we wanted. And then when we had confidence in that, we said, oh, OK, cool. Now what do we do about syntax?

MLIR: The new LLVM

00:06:28
Speaker
Right.
00:06:30
Speaker
Well, I have to ask, what's it like to write MLIR? So MLIR, if you're not familiar with it: Mojo is to Swift as MLIR is to LLVM. So LLVM is super widely used. It's 25 years old-ish at this point.
00:06:50
Speaker
It was brought up with a philosophy of: there should be one IR to rule them all, and it should be the LLVM IR that many compiler folks are familiar with. And that was great for a generation of things that look basically like C, or that you could de-sugar down into something that looks basically like C. And so LLVM is, I kind of joke, C with vectors plus exceptions, or C with vectors plus debug info, or something like that. So it's in that kind of category of representation.
00:07:18
Speaker
But again, when you start getting into GPUs and other accelerators, what you end up wanting is multiple levels of abstraction, something more like the Haskell GHC compiler, where you have many different lowering phases and you're exposing different optimization opportunities at different levels of abstraction. And so in 2018, I led a project to build this thing called MLIR, which is a replacement for LLVM that allows making domain-specific compilers really fast.
00:07:46
Speaker
And so it provides a bunch of the core compiler infrastructure: the IR representations, the pass managers, bisecting tools, the ability to parse and print IR, and all this kind of stuff, and it allows you as a compiler hacker to focus on your domain. And so you can design your type system, you can design the structure of the operations that you want to manipulate, and things like this.
00:08:09
Speaker
And so MLIR is a very useful Swiss Army knife for building domain-specific compilers. I've used it in the past to build hardware synthesis tools and all kinds of stuff. And so that's what we're using. And one of the great things about MLIR is it has a text form. And so you can just write the text form.
00:08:26
Speaker
But I'll say writing MLIR by hand is really painful. That's not actually a good user experience. It's something that you can do to prove a concept, but you never want to expose that to users. I imagine it's a pretty fuzzy question, but is it lower level than C, higher level, just different? It's domain independent. And so you can use it to do machine-level
00:08:50
Speaker
code representation if you'd like, and a lot of people do that for accelerators and things like this. MLIR is widely used in the machine learning code generation community, for example. But we use it for our AST. And so you don't need to use it for any particular thing. And in the case of Mojo, we don't build a traditional AST; we actually just generate MLIR directly from the parser.
00:09:14
Speaker
Okay. Okay. So then in that case, we have to go both down into the hardware and see what you did with that. But why don't we go up into user space before we get into the weeds, just briefly. So you've proved out your theories to the degree that you're ready to share this approach with the world. Now it's time to put lipstick on our pig.
00:09:40
Speaker
Let me use the git metaphor. We'll put porcelain on the plumbing. And you've chosen to go quite close to Python, right? Why that?

Challenges in developing Mojo

00:09:54
Speaker
Yeah, so when we decided, OK, this seems to be working, we think we have a novel and useful code generation approach, we had to decide how we were going to get this IR. And there's a couple of different approaches. One is, and by the way, building a programming language is insane, and we can talk about why. But the obvious thing to reach for is a domain-specific language embedded in some other language.
00:10:19
Speaker
And so you use, for example, Python and use some decorator-based approach or some way to stage out something and then generate IR by walking some other language's AST. The other approach is you say, OK, go build a language. Another approach might be to say, build a source-to-source translator or something like that. And so there's lots of different technologies that we can use. And so in our case,
00:10:44
Speaker
You know, in this emerging world of high-performance accelerators and GPUs, everybody's dying right now because of how complicated the stuff is and how challenging the programming model is. And so we decided, let's do the hard thing. Eyes wide open, we understand how hard it is to build a programming language.
00:11:02
Speaker
Let's do that. And we think the cost-benefit tradeoff is worth it. And let me tell you why it's a bad idea to build a programming language. I mean, you probably know, but it's not just about building a grammar or a compiler. You also have to build all the tools. You have to build code formatters and you have to build a debugger. You have to build this entire ecosystem: you have to have a library ecosystem, you have to have developers who care, you have to build the technically difficult thing in the right way.
00:11:31
Speaker
And so there's a lot of design and tech in order. Yeah, you need a package manager, an LSP, and a Slack channel. Yeah. That's right. That's right. And so in the case of Mojo, we said, okay, well.
00:11:41
Speaker
And as you know, I'd been through that with Swift. And so yes, we bikeshedded all the syntax. We built all the different things. And so what I realized is that I want to control some of the risk, but also meet developers where they are. And in AI in particular, Python won.

Python's influence on Mojo

00:11:57
Speaker
Like, Python won. And so everybody uses Python.
00:12:03
Speaker
If you want to pick anything that's not Python, you have to justify why it's better than Python. "I have curly braces instead of tabs" isn't actually a great answer, because at the end of the day, while you and I and many people love programming languages,
00:12:20
Speaker
people have muscle memory. And so I care about the hundreds of millions of developers who already know Python, right? And them not having to retrain is actually a huge feature. From an engineering and design perspective, it's also very useful, because we already know what the language looks like.
00:12:34
Speaker
So we don't have to bikeshed all the things. It's way easier to implement something when you know what it is, versus having to first-principles everything and rationalize it. Now, Mojo is a very extended version of Python. We could talk about what it's all about, and there's still design work, but at least we could focus our energy.
00:12:52
Speaker
Yeah. Most of those early decisions are made for you. Yeah, exactly. And so, for example, Python uses this fairly goofy postfix ternary thing where, you know, the if goes after the condition and stuff like this. Oh, yeah. Yeah. That is not my personal cup of tea, but it doesn't matter. It is proven. It exists. It works. And if we were to change it, we'd have to massively justify why we want to break compatibility or break
00:13:17
Speaker
familiarity or break all the training that people already have. So it just really anchors the project in a very convenient way. Okay, yeah, I can see it. I have to ask though, if I were building a language that was focused on high performance and optimizing for GPUs and all that stuff, I would be torn between going after the ML data science world and the gaming world.
00:13:39
Speaker
Yeah, you could have gone for those, or you could have gone for PlayStation 5 programmers, right? Sure. And there's a lot of money in that. Did it tempt you? Well, not really, honestly, just because we're mission-driven. We have a specific thing with GPUs and accelerators and AI, and this problem is burning. I've been in the space for five, six, seven years at this point. And so we're very much on that track.
00:14:05
Speaker
But without loss of generality, Mojo can be used for lots of different things. We have people building GUI libraries and web servers and everything, so it's a general purpose programming language. We can talk about the nature of the language, but it is fully general purpose.
00:14:20
Speaker
Our design points were rooted in our use case and the problems that we were trying to solve. And there is no language out there that does what Mojo does. And so, solving for those... you know, another option was to just use somebody else's language, right? And if there were another language out there, it would have been great to not have to do all this work, because we could get on with life and focus on other parts of the problem. So yeah.
00:14:43
Speaker
Yeah, I can see that. I can totally see that. So, before we get into what makes it... I will say, we do have a number of people on our team coming from the games industry, so... Okay. So you've at least tempted some of that audience across, too. Yeah.
00:14:59
Speaker
So before we get into what makes Mojo different, let me just quickly ask: I've written plenty of Python. If I came to Mojo, how familiar would it be? Would it feel exactly the same until I started using the new things? Yeah, so Mojo's not done yet. It's probably halfway through its development journey, I would guess, roughly. But when it's done, it will be a full superset of Python.
00:15:21
Speaker
And so if you're familiar with Python, then you know about the Python 2 to Python 3 transition. It just about killed everybody involved. It sucked energy out of the community for 15 years. It was kind of a gigantic mess. And the reason that that happened is because it was similar but different and packages could not interoperate with each other. You couldn't have a hybrid half Python 2, half Python 3 program.
00:15:44
Speaker
Exactly. And so that was a huge problem. And so in the case of Swift, what I learned is, I mean, Swift is a completely different language than Objective-C. But what we did was we very successfully migrated, over the course of years, progressively migrated the Objective-C community over to Swift. And we did that by making it so both worlds could live together.
00:16:05
Speaker
You can call one from the other happily. Exactly. And so that way you can decide, OK, well, this third-party package I'm using, I'll leave it as Objective-C, but I'll write my UI or whatever it is that I want to do in Swift. And I'm not in this logjam where everything has to move over. And so in the case of Mojo, we're building it into a full superset of Python. And so all the Python idioms and stuff like that, whether they're a good idea or not, will work in Mojo.
00:16:33
Speaker
But even today, you can import arbitrary Python packages and go to town and use them, and you can mix and match very directly. And so we're continuing to make progress on more dynamic features in particular because, as you say, we're climbing up from the accelerator. So Mojo's a really good replacement for CUDA; we're getting into it being a really good replacement for Rust; and then eventually we'll get to it being a good superset of what Python is loved for. But we have to do those steps and we have to build out the features as we go.
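[To illustrate the interop being described here: importing an arbitrary Python package from Mojo looks roughly like this. A minimal sketch based on Mojo's documented Python.import_module API; exact details vary between Mojo versions.]

```mojo
from python import Python

fn main() raises:
    # Imports a real CPython module, so the existing ecosystem just works.
    var np = Python.import_module("numpy")
    var arr = np.arange(15).reshape(3, 5)
    print(arr)  # NumPy semantics, called directly from Mojo
```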
00:17:03
Speaker
Yeah, yeah. No matter what happens when you're writing a language, you're climbing some kind of enormous mountain. And that's the shape of yours.
00:17:13
Speaker
Speaking of the size of that mountain, does that include things like, do you support Python's interop with C syntax and approach? We haven't decided that, actually. So if you're a Python programmer looking at Mojo, one of the things that's really interesting about Mojo is that it is Pythonic and it's growing into all the different dynamic things that Python can do. But you don't have to switch to a different language to write
00:17:42
Speaker
high-performance code, right? So if you think about Python... Python, like Objective-C actually, since you're a veteran: Objective-C and Python both have the fully dynamic, object-oriented "we want all of our APIs to be built in the objects world" philosophy, but then there's the dark truth that, you know, all the important stuff is written in C or C++.
00:18:04
Speaker
Right. And so, look at NumPy as just one example. It's a very nice Python API, but it's all written in C and C++, wrapping Intel MKL and all this kind of stuff. Yeah. Right. So with Mojo, you can write all of that stuff in one language, so you don't have to switch out. And the way that works is that it's a progressively typed language, and it has real types, not the MyPy types that Python currently has. Plus it has really fancy compilers, all next-generation stuff that has been created over the last couple of years from,
00:18:34
Speaker
you know, learning from decades of other stuff that got built. And so the consequence of this is that you don't have to switch out. And so, that integration with C...
00:18:45
Speaker
The reason it's challenging is that Python exposes its full object model to C code. And so we actually fully support that right now, because we can talk to Python, we can talk to Python C modules, and so that already works. But there's a question at the low level of: do we want to support that really low-level binding when talking to C bindings, or do we want to encourage the community to just switch stuff to Mojo, or not? And so we haven't decided; we'll figure that out. So it may be in the future that whilst you can't use the Python wrapper around the
00:19:14
Speaker
C library, you might find it very easy to go direct to the C instead. Right. Well, Mojo can already call into C, and so that's all good. Yeah. Okay. So you've pricked up my ears on one of my favorite topics, which is types.

Mojo's type system and AI integration

00:19:30
Speaker
Yeah. Because I read on your website that you've got a progressive type system in Mojo, and I actually don't know what that is. Because it's not the same as a gradual type system, right?
00:19:45
Speaker
Yeah, so maybe I'm not being very precise on terms. I'm also, by the way, an engineer. I'm not a mathematician. So just as a disclaimer. But sure, so what's the question? How can I help? So tell me about Mojo's type system. Is it opt-in? What kind of type system is it? How much do I have to learn to use it? Yeah, OK, so all good questions. So start with.
00:20:15
Speaker
How do you get a typed language to talk to Python? So this is the very first question. Then you have to ask, what is the existing Python type system? And if you ask a lay programmer, they would typically tell you, oh, well, Python doesn't have types.
00:20:30
Speaker
Right. And then if you talk to a more advanced developer, they'd say, oh, it has dynamic types. Right. And so there is a list, there is a dictionary, there is a string or an int, but it's all runtime. Right. Yeah. As somebody who knows programming languages and likes some static languages now and then, we can say Python has one type.
00:20:48
Speaker
And that one type is a reference to a Python object. And because there's one type in Python, you never spell it. And so that's how I would say Python actually is: it has a static type system, it just has one type. OK. Yeah, I can see the argument. So now if you go down that direction, you can say, let's give this thing a name. And so in the case of Mojo, it's called PythonObject. We'll decide if we really want to call it PythonObject forever or if we want to shorten it to object, but it has a name. It's called PythonObject.
00:21:16
Speaker
And now you can say, hey, well, I have a new type. I'm going to call it int or string or whatever. And so when I say something is an int, that's a different type.
00:21:26
Speaker
I can have conversions from one to the other, and I can make these things integrate and operate. But if you're a programmer, you can say, hey, this thing is an int. And now, of course, it's not boxed. Of course, it's on the stack. Of course, it's the size of the word on your machine or whatever. And of course, it works the way it would work in C or C++ or something like this.
00:21:47
Speaker
So you have the ability to opt in to a fully static world. And so again, we work with CPU and GPU high-performance numeric programmers. They never want anything dynamic. They want full control over the machine. They want full control over very fiddly, low-level optimization things. And so you don't want these worlds where it's like, hey, I'll write some dynamic code and we'll try to de-virtualize it.
00:22:08
Speaker
What that community wants is like the Rust ideal in many ways. It's like, I want full control. I want predictability. I want you out of the way. And so we can provide that. And then if you start removing types, you can get fully dynamic if you'd like. Okay. So if I say that something is a
00:22:26
Speaker
I'm thinking of PureScript here, where I can take this thing, which is just a foreign thing coming from the world of JavaScript, and declare it's a string, and so now you're going to treat it like it's a string and it might be almost anything under the hood.
00:22:41
Speaker
Yeah, so we are not in that mode. There's a whole class of languages, including TypeScript and even Python itself, where the fundamental nature of the universe is untyped, but you can provide type hints, and those type hints can be used for error messages or other things like this. We're not that; that is not what we do. Okay, so we have types and we have
00:23:06
Speaker
PythonObject. And if you want to go from PythonObject to an Int, you do a conversion.
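[A sketch of the distinction being drawn here: one dynamic PythonObject type versus a real static type, with an explicit, checked conversion between them. Illustrative only; the exact spelling of the conversion may differ between Mojo versions.]

```mojo
from python import PythonObject

fn main() raises:
    # The "one type" of Python: a reference to a boxed Python object.
    var dynamic: PythonObject = 42

    # An explicit conversion into a real machine integer. This is checked:
    # it succeeds or raises, rather than being a "trust me" type hint.
    var n = Int(dynamic)
    print(n + 1)  # `n` is an unboxed, word-sized integer from here on
```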
00:23:11
Speaker
Okay. So the conversion is checked. There's no, like, "trust me, it's going to be okay" typing; it's actually typed. And this is actually very important. And then the nice thing about this is it all composes correctly through the type system, which is the big thing. So that's the very top level: how we handle what I would call progressive typing. Maybe it's the wrong word, I don't know.
00:23:42
Speaker
But within the static type system, then you can ask, OK, well, how does it work? Well, it's both very boring in some ways, and also, I think, very boundary-pushing in others. It's boring in that we've learned from category theory, we understand how traits in Rust and protocols in Swift and all these things work. And so we're just taking a modern, statically typed, generic, higher-order functional type system, like many modern languages have, and bringing that in.
00:24:08
Speaker
I think it's nice, and it's important to get that right, but it's pretty familiar to people, and I think familiarity is good. The other side of it, though, is that we are very hardcore about pushing things into the library.
00:24:24
Speaker
And so if you take a look at Rust, for example... but in this case, I'll pick on C++ because it's an easier victim. C++ is a very powerful language. You can build high-quality libraries in C++. But because of its historical roots, it burns a whole bunch of weird things into the language. So for example, complex numbers are std::complex, a template, but float and double are built into the language.
00:24:50
Speaker
And so there are certain things, certain conversions, certain weird behaviors that only work with built-ins and can't be exposed out to libraries. For example, you can't overload operator dot in C++; you can't build a smart reference. I think that's one thing that's been driving Bjarne nuts for decades now. And so in Mojo, we take a very aggressive approach to this, which is to say: push everything we can into libraries, and push the magic out of the compiler.
00:25:15
Speaker
And so Int is not built into the language. Float is not built into the language. These things are all just libraries, and so they're just structs. And so the way this works is that the type system composes, and we use it for all the built-in things. And yes, string and array and all these things, of course, are also written in the library. But that forces the language to provide only the core essentials for expressing libraries.
00:25:43
Speaker
And we want everything to be ergonomic and very dynamic and flexible and powerful for the Python folks, et cetera. And so by doing this, the compiler-language-library divide is balanced a little bit differently than what you'd see in traditional languages. Right. Yeah. This is reminding me of when I first learned in Haskell that Booleans aren't... yeah, they're not part of the language. It's just that it provides you the tools to define Booleans.
00:26:09
Speaker
Yeah, otherwise it's just an alias for true. OK, so how does that actually work? Because I see a tension there between all the main types being kind of user-slash-library-space defined, and wanting really, really high-performing optimizations for an array of ints.
00:26:31
Speaker
Yep. Well, so on one side, old tricks are the best tricks. Mojo believes in zero-cost abstractions, for example. And zero-cost abstractions are seen in C++ and Rust and many, many languages, so nothing wildly innovative there. The way we do it is pretty innovative, because we have this fancy MLIR compilery stuff behind the scenes, but the concept is the same. And what that does is allow you to stack up turtles.
00:27:01
Speaker
Okay, so you can have a lot of turtles and we're good. We love abstractions, and so we can stack up turtles like anybody. But you have to have something where you bottom out. There has to be something underneath that bottom turtle, right? And so Mojo, by design, takes a lesson from Swift and then takes it 10x. In Swift, I sometimes used to joke that Swift was syntactic sugar for LLVM.
00:27:26
Speaker
Yeah. So Swift's Int was also defined in the library. Swift's Int was a wrapper for an LLVM i32 or i64, the underlying IR thing. And the way that worked is that Swift, at the very bottom, could talk directly to LLVM primitives.
00:27:41
Speaker
And so what you're doing is building syntactic sugar: you're defining types, overloading operators on them, and doing all this stuff to build up these core operations. Mojo does basically that same trick, but it supercharges it by moving to this MLIR world. And MLIR, being a much more modern compiler stack, has way more powerful features.
00:28:02
Speaker
We can expose things like, you know, float4 and these really weird numerics you see on accelerators. Again, we're talking to accelerators with tiled matmuls, like the Tensor Cores on GPUs and things like this. And so we can talk to all these really exotic
00:28:19
Speaker
features, and then wrap them in really nice libraries. And so you get the benefit of the power of the hardware, and direct low-level access to very crazy, exotic things, but then you have syntactic sugar built in libraries, and now you can extend the system, extend the language, without having to be a compiler nerd, which I think is very, very important.
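[To make the "turtles bottom out at MLIR" idea concrete, here's roughly how a library-defined integer type can wrap an MLIR primitive directly. A simplified sketch in the spirit of Mojo's stdlib Int; MyInt is a made-up name, and the real definition is more involved.]

```mojo
struct MyInt:
    # The payload is an MLIR-level index type, not a boxed object.
    var value: __mlir_type.index

    fn __init__(inout self, value: __mlir_type.index):
        self.value = value

    fn __add__(self, rhs: MyInt) -> MyInt:
        # Bottom of the turtle stack: call an MLIR operation directly,
        # then wrap the result back up in nice Pythonic syntax.
        return MyInt(__mlir_op.`index.add`(self.value, rhs.value))
```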
00:28:38
Speaker
How does that actually look? That sounds fascinating, but I'm wondering, let's say I decide that I'm going to write a complex number library for Mojo.
00:28:51
Speaker
It's simple, because I'm writing it basically in Python. Yeah, you have a struct with two floats in it, and then you go and overload the __add__ method, which is the Python way of overloading the plus operator. But at some point, am I going to say, oh, I can tweak this if I just mix in a bit of MLIR code?
00:29:12
Speaker
Yeah, if you'd like to. So for example, take complex numbers. Complex numbers are hardware-accelerated on certain chips.
00:29:23
Speaker
Add of two complex numbers is simple, because you're just adding two floats. But if you do a multiply-accumulate, or a complex multiply, which involves, I mean, my math is rusty, but it's like four multiplies and a couple of adds or something like that; certain CPUs have operations that just do that in one shot. And so you could say, okay, well,
00:29:45
Speaker
effectively, like an if-then-else: we have a compile-time metaprogramming system that you'd use to do this. But effectively, it's an if-then-else: if I'm on this system and I have this feature, go do the optimized thing; else, just do the generic thing. And then you have full access. So, as a library writer, I can do opt-in hardware-level acceleration. That's right.
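[A sketch of the complex-number example just discussed: a plain struct with Python-style operator overloading, plus a compile-time branch for opt-in hardware acceleration. has_fused_complex_math and hw_complex_mul are hypothetical placeholders; @parameter if is Mojo's compile-time conditional, and decorator details vary by Mojo version.]

```mojo
@value
struct Complex:
    var re: Float64
    var im: Float64

    # `a + b` calls __add__, exactly as in Python.
    fn __add__(self, rhs: Complex) -> Complex:
        return Complex(self.re + rhs.re, self.im + rhs.im)

    fn __mul__(self, rhs: Complex) -> Complex:
        # `@parameter if` is evaluated at compile time, so only the
        # selected branch is ever code-generated into the binary.
        @parameter
        if has_fused_complex_math():  # hypothetical feature check
            return hw_complex_mul(self, rhs)  # hypothetical intrinsic wrapper
        else:
            # Generic fallback: four multiplies and two adds.
            return Complex(
                self.re * rhs.re - self.im * rhs.im,
                self.re * rhs.im + self.im * rhs.re,
            )
```

[Callers just write `a * b`; whether it lowers to one fused hardware op or the generic math is decided at compile time, so clients never have to care.]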
00:30:09
Speaker
Yep. Yep. And still have a nice fallback. Okay, that's fun. And your clients don't have to worry about it. That's actually a really powerful thing. And would I be able to say things like, if I'm on this architecture, do this and if I'm that architecture, do that. If all else fails, just use Python stuff.
00:30:27
Speaker
Yeah. And again, Mojo has superpowers because of the domain it's in. So you can write one piece of source code, and that source code runs partially on your CPU and partially on your GPU. And these may have different pointer sizes and different numeric capabilities. And the way the whole stack works, it supports specializing the code for each of these. There are these very deep, fundamental, very nerdy compiler things that enable this to just work in a way that people aren't quite used to.
00:30:53
Speaker
Okay, this is fun, because this means, potentially, you could see a future where someone writes a really useful library, but it's not fast enough, and someone just PRs in the hardware optimization without changing the programming language. Yeah. Well, and also, again, Mojo was designed in 2022 instead of
00:31:15
Speaker
in 1980. We have things like SIMD. SIMD vectors are a thing; all computers have them these days. We have direct support for that, direct support for explicit vectorization. So you get full access to the hardware. And so it's been really fun seeing Mojo developers worldwide, where they just take something like Game of Life or whatever, and they say, OK, I'll start using this, I'll try this. Oh, hey, wow, it's 1,000 times faster than the code I started with.
00:31:40
Speaker
That's cool. Because again, with library-based extensibility, a vectorize function is just a library function. And so it's like, OK, well, I will vectorize this code by using a combinator from the library, building into this one step at a time. And I've seen tons of people go through this growth path where they're like, oh, wow, this is really cool. And oh, wow, I'm having fun. Oh, wow, I'm building something interesting. Oh, I'm learning something. And this is why I think people love going through the growth path.
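[A sketch of the "vectorize is just a library function" point, loosely following Mojo's algorithm.vectorize combinator. The exact signatures here have shifted between Mojo releases, so treat this as illustrative.]

```mojo
from algorithm import vectorize
from sys.info import simdwidthof

fn scale(data: UnsafePointer[Float32], n: Int, factor: Float32):
    alias width = simdwidthof[DType.float32]()

    @parameter
    fn body[w: Int](i: Int):
        # Load w SIMD lanes, multiply, and store them back.
        data.store(i, data.load[width=w](i) * factor)

    # A library combinator, not a language feature: it emits full-width
    # SIMD iterations plus a scalar remainder loop.
    vectorize[body, width](n)
```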
00:32:05
Speaker
Okay. Well, in that case, we have to get down now into CPUs and GPUs properly, I think. So what are these theories that you proved out in MLIR space that make Mojo compelling to you?
00:32:23
Speaker
Well, so we probably shouldn't dive too deep into that; we can stay in the language stuff. But I'll give you some intuition. So in the AI space, I'm in love with AI, both for the user applications, which is what most people talk about, but also for all the systems and all the technology that got built to support this. And to tell you one thing that I find fascinating:
00:32:45
Speaker
So I worked on the TPU project at Google, and helped bring up these massive data center accelerators, exaFLOP computers. They're supercomputers with thousands of nodes and all this stuff. And one of the things I find just super inspiring is that today, you can sign up for a Jupyter notebook on Google Cloud and get access to one of these things. And with a few lines of code, you can be programming an exaFLOP supercomputer.
00:33:11
Speaker
And you're describing a novel computation and it gets mapped, partitioned, scaled out across thousands of chips, run at massive data center speed. And this is what HPC people have been doing for a long time. But now you have AI researchers doing this.
00:33:28
Speaker
They're not writing MPI code or low-level high-performance stuff. What made that possible was a shift from completely imperative programming to declarative programming.
00:33:44
Speaker
And the way this works is that in AI, you build a machine learning graph. I mean, there's many variants, but for example, you build a machine learning graph and you have the AI researchers thinking about the level of, hey, I'm going to have a matrix multiplication. I'm going to have a convolution. I'm going to have a gather or a reduction or whatever. And so I'm thinking about
00:34:02
Speaker
It's almost like APL, again, bringing it back to language nerdery, right? And so they think about simple compositions of these highly parallel operators. And then what you do is you give this graph to a very fancy compiler stack, which is not just picking out functions that implement each of the operators; it's actually doing fusion of the loop nests.
00:34:29
Speaker
And so you're taking this very complicated math, doing very high-tech compiler transformations, and then also dealing with distribution across clusters, and these things you can do because it's a declarative specification.
00:34:44
Speaker
If you try to take a pile of C++ code and parallelize it, well, good luck with that. That's not a thing. But if you say, hey, run four copies of this across four machines, well, that's easy to do, relatively speaking. And so what this whole stack evolves into, in the case of Mojo, is we have
00:35:05
Speaker
what's called the MAX Engine. The MAX Engine is a very fancy AI compiler stack. It's kind of like an XLA, but after learning a lot of lessons, if you're familiar with these things. And so it can run machine learning graphs, but we want it to be able to talk to imperative code.
00:35:23
Speaker
And so you need to be able to write custom operators. You need to be able to invent new algorithms. If Modular doesn't know what an FFT is, but you do, and that's really important to your signal processing domain, then you need to be able to provide an FFT. We want the stack to be completely extensible. And so what Mojo enables people to do is write an algorithm in very simple, familiar code. It's all good; you can understand it, because you're just writing source code.
00:35:49
Speaker
In contrast, CUDA is kind of its own little world. It's very not Python, right? And so instead of writing a CUDA kernel, you can write some Mojo code. And then the graph compiler and the other stuff can suck this up. Because of MLIR, we can now reflect onto it, and we can see what the code is doing.
00:36:05
Speaker
And so that allows us to take it, do these fancy compiler fusion things, and do the placement, do all this stuff. And that's something the world doesn't have. Because in the AI space, the state-of-the-art technologies, I mean, there's a lot of stuff out there, it's a very crowded space, but the state-of-the-art technologies are built around CUDA. And they're built around math libraries like Intel MKL and things like this. And these operators are all black boxes.
00:36:32
Speaker
And so there exist these fancy graph compiler things, but they don't actually have the ability to see into the logic that they're orchestrating. And so they can't do these high-level transformations, and it's just very janky in various ways. And so, again, our mission is to solve this, to take all these systems
00:36:52
Speaker
a major step forward. And so this is what drives, you know, Mojo needing to have extremely high performance, because we want to push the state of the art on performance versus vendor libraries that do matrix multiplications, for example. It's also why we care about usability, because we have people that are using the same thing to build graphs, and they're used to Python. And so we want to meet people where they are; they're used to PyTorch or something like this. And so it kind of all flows together, and that's where we're coming from.
00:37:19
Speaker
Okay, so there's a hard divide between the world of building up this data graph and creating custom nodes that you would insert into that graph. And then to make it even more funny, I don't know how deep you are in the AI space.
00:37:38
Speaker
It's so funny because the traditional TensorFlow PyTorch things were designed eight or 10 years ago, depending on how you count. They're coming from a research world. They're coming from an AI training world. But today, roughly everything, not everything, but a huge amount of the focus of the AI industry has shifted to deployment. And so when you get into deployment mode and you're shipping and running an AI model on a server or something, you don't really want Python in production.
00:38:06
Speaker
And so it turns out there's some challenges. I mean, it can be done, but there's some challenges with that. And so what we've entered into this world is we have researchers who love Python and live and die and breathe Python. And it's great for their use case. But then you have production people that have to rewrite these models and rewrite the tokenization logic for your LLM in C++ or Rust to be able to ship something.
00:38:32
Speaker
And so a big part of what Mojo is actually about is solving that problem: by having one language that can scale, we can heal this divide between all the personas that are building these systems, whether they're high-performance numerics people, deployment engineers, or AI researchers, and just get everybody to be able to talk to each other. Because they're literally speaking different languages, and that's massively impacting AI getting into production.
00:39:07
Speaker
Well, it's an evolution of a lot of very good, very well-considered systems that were locally developed, aggregated, and then hill-climbed like crazy. AI has changed a lot in the last five to eight years, and nobody's had a chance to go back and first principle some of the technology. You look at that, eight years in computing is nothing.
00:39:22
Speaker
I had no idea the AI world was divided like that into different hats.
00:39:31
Speaker
That is nothing. And so all this stuff grew really quickly. And so this is where you say building a programming language is insane. Well, I mean, these are my words, not yours. You're too polite to say that. But I don't think it's insane. It's just a multi-year project. And so you can say, OK, well,
00:39:52
Speaker
what are the costs and benefits of doing that? Well, you have to be very practical about this, and you have to make sure not to sign up for something you can't deliver on.
00:40:02
Speaker
If the results are worth it, it's a big bet that lots of other people aren't willing to take for a wide variety of reasons. You have to be right, but if you're right, then it's actually a really good contribution to the world. Yeah, I think if you've got a good enough reason to create a language, then I don't think it's insane at all. If you have no reason to create a language, but just fancy as a hobby project, then it's insane. Oh, that's also cool. I love that too. But there's something in that middle space which is definitely kind of crazy.
00:40:31
Speaker
Okay. I have to ask, because I don't know much about this world: if you're writing one of these operators within a machine learning graph that you've described, what's that like in Python-esque Mojo programmer space? Because it feels like it's going to be something like pipelines, like I'm going to yield a value and await a value.
00:40:59
Speaker
It depends on what kinds of code you're building, but the analogy is it's kind of like writing C++ or Rust code, but with Python syntax. You get rid of all the templates; you get rid of all the line noise.
00:41:17
Speaker
But you're writing for loops. And because Mojo provides higher-order functions, I mean, combinators that you build in the library, you're often composing together: hey, parallelize this region, vectorize this region. And so the code style looks a little bit different than literally just writing for loops. But Mojo is disarming to people, because people are taught, for example, never write a for loop in Python.
00:41:44
Speaker
Python is slow. Never write a for loop in Python, right? And so in Mojo, all that wisdom is invalid, because it has none of the DNA of Python; it's not the same implementation. And so a lot of these things that people knew to avoid are actually totally fine. It's also super funny. We have folks that are writing
00:42:07
Speaker
high-performance Tensor Core stuff and VNNI low-level stuff. And they're really experts in low-level system architecture. And they're like, it's so weird to be writing assembly code in Python. And so it does twist your brain or open your eyes or shift your perspective, however you want to look at that. But otherwise, it's familiar. And again, a lot of what we're going for isn't driven by novelty for novelty's sake. It's about pragmatism.
00:42:38
Speaker
One thing I'd love to talk about is the compile-time metaprogramming piece of this as well. Oh yes, okay. Well, why don't we go there now? I've got a couple of other questions for you, but yeah. Well, so, I mean, this is another big bet that we've made. In these domains, in the AI world where you're building models, models and source code are effectively a metaprogram.
00:42:59
Speaker
They're a bunch of imperative logic, and then they describe roughly a graph. And that graph is the thing that then you distribute and transform and do whatever. And so Python has long been used for metaprogramming for a wide variety of different domains. But that's one of the reasons it's been very successful in the AI community.
00:43:17
Speaker
If you look at the other world, the high-performance numerics world, people often use C++, and they use templates, and you're metaprogramming these things because you want an algorithm that works on both float and double. Float32 and float64, right? Yeah, yeah.
00:43:34
Speaker
And of course, then it turns into this massive cataclysm of templates, right? Also, more modern languages, for example Zig, have said, okay, well, let's not have
00:43:48
Speaker
a different metalanguage than the language. Yes. Right. Let's actually use the same language for the metaprogramming as for the programming. And so again, in the case of Mojo, we're learning from the good work of many other domains. And so we said, OK, that's actually a really fantastic idea. Python is highly dynamic. You can overload operators; you can do all these things dynamically. But we can't pay the expense. We can't have even a single extra clock cycle in our domain. We have to have
00:44:16
Speaker
bare metal performance, but we want the benefit of the abstractions and the extensibility that Python provides. And so what we do is we say, okay, let's take the benefit of Python, let's do everything dynamic, let's take comptime metaprogramming, let's fuse these things. And this is one of the major ingredients that allows Mojo to be extremely expressive. Because you can build, just like in Zig, I mean, we have a slightly different take, but the same idea as Zig, you can build
00:44:41
Speaker
the standard runtime algorithms. You can allocate heap data structures. You can do all this stuff and then use it at compile time. And so you get this composition that enables really expressive libraries. And nobody likes C++ templates, right? And so you get the benefit of building these things, which gives you the ability to build these combinators and these higher-level functions and features, and compose the benefits of the compile-time world plus the runtime world.
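[A sketch of that fusion of metaprogram and program: one function parameterized on its element type at compile time, doing what a C++ template would, but in Python-ish syntax. Illustrative; parameter and collection details vary by Mojo version.]

```mojo
# `dtype` is a compile-time parameter (square brackets); `values` is a
# runtime argument. The compiler stamps out a specialized version per dtype.
fn total[dtype: DType](values: List[Scalar[dtype]]) -> Scalar[dtype]:
    var acc: Scalar[dtype] = 0
    for v in values:
        acc += v[]  # dereference the list iterator's reference
    return acc

# The same source serves both widths:
#   total[DType.float32](...)  and  total[DType.float64](...)
```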
00:45:10
Speaker
OK, so to be clear, are we saying that you're introducing a thing where I can write, it looks like I'm writing Python at compile time to generate Python to then compile and run? Yes. Or another way to say it is you have values and objects and functions and features and classes and types and things like this, and you can use them either at compile time or runtime.
00:45:34
Speaker
Okay. But I can start constructing my own abstract syntax trees at compile time? We haven't gotten that far, but in principle we could support that. It's more like, well, a simple example is you say, okay,
00:45:50
Speaker
I have a function that creates a lookup table. And so it's a normal function. You can call it at runtime. You can pass dynamic values as the arguments. Cool. Give me a std::vector, we call it a List: give me a dynamic collection of values that are populated through whatever crazy math you want to do. And now you say, OK, cool.
00:46:14
Speaker
That lookup table is static, and the inputs to the table are always static. So just go call that at compile time and just give me the table. And so what it does is it says, OK, at compile time, go run that function, calculate the dynamic data structure, do all the logic that you're doing. The output of that is an object; it's a list. Burn that list into the executable. And now you have the object directly, instead of having to compute it.
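[The lookup-table example, sketched with Mojo's alias declaration, which forces a call to be evaluated by the compile-time interpreter. Illustrative only: support for materializing heap collections like List at compile time has varied across Mojo versions.]

```mojo
fn make_table() -> List[Int]:
    # Ordinary code: heap allocation, loops, whatever math you want.
    var t = List[Int]()
    for i in range(256):
        t.append(i * i)
    return t

# `alias` evaluates the call at compile time; the finished list is
# burned into the executable instead of being computed at startup.
alias SQUARES = make_table()
```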
00:46:37
Speaker
Now, that's a simple example. There are many fancier examples, because when you do this, suddenly, as a type system nerd, maybe you'd appreciate that types are just values, right? Your types are just compile-time values. And so you can do much fancier, higher-level programming at compile time, using types as values and things like this. And so there's
00:47:04
Speaker
a whole rabbit hole there. But the cool thing about it is that it comes back to enabling library developers to make domain-specific abstractions and build things that allow modeling their world very clearly. OK. I remember talking with Loris Cro about Zig; we were saying that that would be a way to take a generic list and, at compile time, make the optimized version. Yep, same idea.
00:47:32
Speaker
And of course, Zig has its own personality as well. It's a very low-level language; it eschews syntactic sugar and things like this. It's very different in certain ways from Mojo. Mojo wants to enable libraries and abstractions, and so that's its focus. But this use of comptime and that idea is very common. We're very happy to admit that we learn from other people; we didn't invent everything.
00:47:56
Speaker
No, we all stand on the shoulders of giants. I'm sure they were inspired by Lisp, which goes all the way back to the start. Everybody's doomed to re-implement their own Lisp, right? Yeah, and that's not crazy. That's a good learning exercise. Maybe shipping it is not always a great idea.
00:48:18
Speaker
Okay, so we've actually not tackled this, but we've skirted around it the whole discussion. The moment you're going from Python into real low-level dealing with CPU stuff, surely the idea of memory management is going to leak in here.
00:48:40
Speaker
Tell me what you've done there. Yeah. So if you're dealing with PythonObject... one of the ways that we can embrace the entire Python ecosystem is we just keep
00:48:50
Speaker
the CPython object model. And so everything is just compatible. And so if you import Python, you get the traditional reference-counted, indirect, boxed-object thing. And that's that. That's cool. Have fun with that. If you enter the world of "I want to write real Mojo code," or native Mojo code,
00:49:12
Speaker
you get a very powerful type system. So at the bottom, you have types that can have move constructors, copy constructors, and destructors. And so you can write code that manually manages resources, and do so directly. And that's one of the foundational things at the bottom. You can call into C, and so if you want to, you can call malloc and free and do stuff like that through unsafe hooks.
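[A sketch of those bottom-level lifecycle hooks: a type that manually manages a resource via Python-style dunder methods. The method names are Mojo's documented value-lifecycle hooks; the Buffer type itself is made up for illustration, and argument conventions vary by Mojo version.]

```mojo
struct Buffer:
    var data: UnsafePointer[UInt8]
    var size: Int

    fn __init__(inout self, size: Int):
        self.data = UnsafePointer[UInt8].alloc(size)
        self.size = size

    fn __copyinit__(inout self, existing: Self):
        # Copy constructor: this type defines what copying means.
        self.data = UnsafePointer[UInt8].alloc(existing.size)
        self.size = existing.size
        for i in range(existing.size):
            self.data[i] = existing.data[i]

    fn __moveinit__(inout self, owned existing: Self):
        # Move constructor: steal the pointer, no allocation or copy.
        self.data = existing.data
        self.size = existing.size

    fn __del__(owned self):
        # Destructor: deterministic cleanup when the lifetime ends.
        self.data.free()
```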
00:49:37
Speaker
But again, we want people to be able to compose together libraries and we want

Memory management innovations in Mojo

00:49:42
Speaker
to do so in a safe way. And so what we have is references. References work very similarly to the ones in Rust.
00:49:48
Speaker
There are many implementation differences, but you can think of it that way. In Mojo, it's way less in your face, and you don't have to micromanage the borrow checker quite as much, but it provides you the same approach and the ability to manage references. This is a really powerful thing and a very important thing, but again, I don't know if you want to nerd out on
00:50:10
Speaker
all the low-level things; it's probably not the right thing for the audience. But we've learned a lot from Rust. And so Rust is wonderful. They paved a lot of roads, and they've done a lot of really great work. But there are certain challenges with the borrow checker. One example is that the way the borrow checker works in Rust is that you have the parser, and the parser has a bunch of pretty complicated rules and special cases for how it generates MIR.
00:50:36
Speaker
And then you have the borrow checker, and the borrow checker comes along and tells you, hey, did you do it right or not? And if you did it wrong, then it tells you. Maybe in the simple cases, it's easy to understand. But in the complicated cases, you're dealing with the order of evaluation of how the parser did things that didn't do what you meant, and all this kind of stuff.
00:50:55
Speaker
In Mojo, our equivalent is actually a very different thing. So we have the same setup, where you have a parser and there's a set of rules. Our rules are very simple and predictable. Like I was talking about, we push a lot of complexity out of the language and into the library. But our borrow checker isn't just an enforcer; our borrow checker decides what the lifetime of a value is.
00:51:15
Speaker
And so a very big difference between Rust and Mojo is that in Rust, values are destroyed at the end of a scope. And so you can run into issues where you get exclusivity violations because something lives too long. And there are various solutions to improve this, like non-lexical lifetimes; there's a whole bunch of stuff going on in the community to try to improve this. In Mojo, the way it works is that a value is destroyed immediately after its last use.
00:51:41
Speaker
Okay. And so it's almost like you have an infinitely OCD garbage collector running. And what this means is a couple of different things. One, it's a much more friendly experience, because lifetimes end earlier, and therefore exclusivity violations get relaxed much earlier, just by default.
00:52:02
Speaker
It's better for memory use. So for example, if you're talking to a GPU, it could be that you have a tensor, and the tensor is holding on to four gigabytes of data. And so that's actually a pretty important thing. It's better for little things like tail calls and other core PL things, because if you have an object on your stack and it gets destroyed after your tail call, which is typically how
00:52:22
Speaker
destructors work, then you don't actually have a tail call. And so there's all these, again, this pile of very low-level, obscure details. Rust also has this thing called the drop flag: in the worst case, they dynamically track whether or not a slot on the stack is live. Because equals, in both Rust and Swift,
00:52:47
Speaker
can either be a reassignment over a value, in which case you're doing a mutation, or it can be the first initialization of a value. And this matters because, in Rust for example, you can transfer out, you can move a value out of the slot, and stuff like this. And so we solve that problem.
00:53:06
Speaker
And so we define that away categorically, and this leads to a lot of simplification of the language and the programming model; you retain the safety benefits and the expressivity of references. Are you saying this is calculated at compile time? Because at the moment it sounds a bit like reference counting. Yes, it's calculated at compile time, and it uses lifetimes.
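[A small illustration of the last-use destruction being described, versus end-of-scope destruction, reusing the hypothetical Buffer type from the earlier sketch; upload is a made-up function.]

```mojo
fn demo():
    var big = Buffer(4_000_000_000)  # e.g. backing store for a 4 GB tensor
    upload(big)                      # hypothetical last use of `big`
    # The compiler destroys `big` HERE, right after its last use, not at
    # the end of the scope as C++ or Rust would.

    var small = Buffer(1024)
    upload(small)
    # The 4 GB were already released before `small` was even created.
```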
00:53:27
Speaker
The only thing I know about lifetimes in Rust is that they're a pain to use, or at least I've never got my head around them. I think some people love them and some people struggle with them. Again, it's like many of these things. You know, I used to work with Graydon, and I know many people in the Rust community; Rust has been built on top of LLVM forever, so I've known the community for a long, long time and have a lot of respect for it. But
00:53:51
Speaker
but also, Rust is 14 years old-ish, right? It's roughly the same age as Swift. And so we've learned a lot from that journey, and it's not a diss or anything. Obviously Rust is a wonderful language with an amazing community, but what Mojo represents is an opportunity to take the learning and do something that's the next step, right? And there's a bunch of ways to simplify it. Yeah, Rust is groundbreaking in that sense, and it's inevitable you'll learn things from that.
00:54:16
Speaker
Yeah, and so what we're doing is we're saying, OK, cool, let's take that forward. Same with Swift: we learned from, and made, many mistakes in Swift and many other systems as well. And so, yes, we're pulling that forward. There's a whole bunch of cool stuff in there. So Mojo supports async/await natively, because that's obviously important for high-performance threaded applications. We don't need pinning, and that's a really big deal, it turns out. And it's because all values have identity.
00:54:46
Speaker
Again, there's these very simple, very fundamental, very low-level nerdy tweaks to the way the type system works. In practice, if you have a Rust program, it will be doing tons of mem copies. There's optimizations to get rid of the mem copies, and sometimes they work, sometimes they don't.
00:55:04
Speaker
But in Mojo, just by the way the whole system composes, you never get implicit memcpys because of moves. And so there's a bunch of these very low-level, how-the-language-and-compiler-are-implemented kinds of things that work well together. Okay, that's making me wonder. You said identity, and it's making me wonder if you've taken a view on things like immutable data.
00:55:27
Speaker
Well, so we are still actively debating some of these things. My experience from Swift... so, if I recall, you said you don't know much about the Swift ecosystem? No, no, I left before it happened, really. So, I mean, in your infinite spare time, you should check it out. It has some cool things. One of the things that
00:55:50
Speaker
We pushed, and Swift has, again, it's not like a brand new language, but it's a pretty modern language. It was built in the last... It started in 2010, I think. It pushes forward functional programming, and it made the observation that... I'll say some things, I'm sure some of your viewers
00:56:12
Speaker
will want to kill me. But functional programmers, whom I love, by the way, will say functional programming is amazing because you never mutate data. You always get new values. And because you get new values, you get composition, you get predictability, you get control, you get all these different benefits of not having mutation.
00:56:33
Speaker
Now, C++ programmers would flip that around and say, yeah, but creating a new list every time you want to insert something into a list is really bad for the machine. It's very bad for performance; nobody can ever build a real system on top of that, depending on how aggressive they want to get. I feel like I'm about to join a bar fight. Carry on. Yeah, whether it's a friendly thing over beers or whatever, at least we're not talking about Emacs versus Vim or something truly controversial. Something really controversial.
00:56:58
Speaker
Well, what Swift does is it says, actually, the thing that you want is exclusive ownership of a value. And if you have exclusive ownership of a value, you can get value semantics. And value semantics in Swift admit local mutation. And Rust has its own take on the same idea. But the idea is if you have exclusive access to a value, you can mutate it. And so now you can have in Swift the array, the dictionary,
00:57:25
Speaker
the string types are all immutable in the Java sense, where it's like, if I have a string, I know that its value will never change underneath me. And so it looks very, very much like a functional programming idiom. And if I have a string, it can't change unless I change it. And if I change it, it's cool. It's not going to break anybody else. And through the implementation, it never does deep copies or
00:57:46
Speaker
or stuff like that implicitly. And so there's a bunch of stuff that was developed and works really well in the Swift ecosystem that I think will come over naturally into the Mojo ecosystem. And we're still working through that. Okay. Yeah. Okay. That makes sense to me without getting into a bar fight. That's it.
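As a rough Mojo sketch of that value-semantics idea, assuming the `@value` decorator (which synthesizes the copy and move boilerplate) and an illustrative `Point` type:

```mojo
@value
struct Point:
    var x: Int
    var y: Int

fn main():
    var a = Point(1, 2)
    var b = a    # value semantics: `b` is an independent copy of `a`
    b.x = 100    # local mutation of `b`...
    print(a.x)   # ...never changes `a`: this still prints 1
```

The point is the same one made about Swift's arrays and strings: you can freely mutate what you exclusively own, and nobody else can observe it.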
00:58:03
Speaker
But the goal is, again, the goal is to bring forward the wonderful things of functional programming. Composition, locality of reference. You don't have the spooky action at a distance thing. Yeah, that's one of the worst things. So this is all what I love about the functional programming model, but then also bring in in-place mutation so you get efficiency. And so bringing those together is good. Swift has some problems where it implicitly copies things a million times and stuff like this, and so we fixed some of those problems. You've learned from that.
00:58:32
Speaker
But that seems to me like it would, like, ensuring you have exclusive access to a value, seems like it must leak into programmer space in a way that will be unfamiliar to Python programmers.
00:58:46
Speaker
But recall, this is all opt-in. And so if you want to use fully dynamic stuff, you can totally do that. And that's totally fine. And so this is one system that can scale because we're not trying to change the existing world. What we're trying to do is fill in the missing world. And so if you look at a modern, one way to look at modern Python is that Python is only half the language. You have Python and then you have C.
00:59:15
Speaker
Right. Right. If you're building a large-scale application in Python, you end up having C, or C++, or Rust, or something else that goes with Python. Right. And so what we're doing is we're keeping the Python,
00:59:27
Speaker
at least keeping the syntax, but then replacing the C and having one system that can do both. And so instead of having a switch from "I have Python and it uses underbar underbar add" to "I have C and C++ and FFI and bindings and all that nonsense", you say, okay, well, I have __add__ and it works the same way, and everything just comes across.
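A hedged sketch of what "underbar underbar add works the same way" means in practice: the familiar Python dunder protocol on a typed Mojo struct, with no FFI layer involved. The `Complex` type here is illustrative, not Mojo's standard library one.

```mojo
@value
struct Complex:
    var re: Float64
    var im: Float64

    # Same protocol as Python: `a + b` dispatches to __add__,
    # but here it compiles down to straight-line machine code.
    fn __add__(self, other: Complex) -> Complex:
        return Complex(self.re + other.re, self.im + other.im)

fn main():
    var c = Complex(1.0, 2.0) + Complex(3.0, -1.0)
    print(c.re, c.im)  # 4.0 1.0
```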
00:59:49
Speaker
Right. Right. I'm with you. I'm with you. In that case, there's one other, I think there's one other big topic in Mojo we haven't touched on at all, which is in contrast to Python. And I'm reminded of it because of spooky action at a distance. Threads, parallelization. That's where
01:00:08
Speaker
Yeah, so it's super first class in Mojo. Every modern machine is multi-core. So keep going with this question. So Python famously has a global interpreter lock. Meanwhile, the moment you allow threads, you get bitten by spooky action at a distance: something changes under your feet. I'm glad to hear Mojo has ways to deal with that. But how do I get into parallelization in Mojo? How do I write a parallel program?
01:00:39
Speaker
Also, again, it pushes things into libraries. We have parallel for loops and stuff like that, and they're just library functions, so you can pass a nested function into a parallel for loop, and that's the easiest way. We actually have a very high-performance, low-level threading library, because today's systems are not just four or eight cores; they're servers with 256 cores. Crazy numbers.
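The "parallel for loop as a library function" pattern looks roughly like this, assuming the `parallelize` helper from Mojo's `algorithm` module; the per-item workload below is a placeholder.

```mojo
from algorithm import parallelize

fn main():
    var total = 1000000

    @parameter
    fn work(i: Int):
        # Each index runs on some worker thread; the real
        # per-item computation would go here.
        _ = i * i

    # A library call, not language syntax: fan the loop
    # out across the machine's cores.
    parallelize[work](total)
```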
01:01:03
Speaker
Yeah, and it's only gonna get more crazy if you go for a year or two or five or 10, right? It's just gonna be nuts. And so this is the world that Mojo's designed for. Now, one of the things we haven't built out, we'll see, I'm not committing to this, don't hold me to this, but we haven't built out an actor system for Mojo yet.
01:01:20
Speaker
Okay. In the case of Swift, we fully did. Swift has a full actor system, which is type-safe. It is very good for large-scale, loosely coupled distributed agents; it even supports distributed actors, if you're familiar with actor systems. I didn't know Swift did that.
01:01:37
Speaker
Yep. And so that worked out really well. It builds right on top of async/await in a very nice way. And so we may do that; I have no idea. Right now we're very focused on the structured compute, more supercomputer-style, numerics side of things. And so I suspect we will find ourselves wanting to do an actor system someday, but we haven't prioritized that yet. Would you possibly expect someone to do it just as a library?
01:02:07
Speaker
Yeah, I mean, so that's also a thing. The Scala world built out Akka, I think, and has done really great things. The only downside about that, and so, I mean, maybe it would be great, and I would prefer it to be in a library if we can, is just making sure that it's memory safe. Because Akka, for example, is not memory safe, and that leads to certain challenges.
01:02:27
Speaker
On the other hand, it can be very pragmatic to say, well, do one step at a time and hill-climb, and get implementation experience with libraries. And then, if there's a benefit to putting something in the compiler to mediate accesses across actors, we can add a little bit of type system support for that, and get the bulk of it in the library with a little bit of type system.
01:02:46
Speaker
Yeah, deciding where the separation should be. Okay, so my options for parallelizing in Mojo are broadly: you said I've got async/await, I can create a thread to run on, and you've got certain parallelized primitives. Yep, yep. And if you want to, you can call arbitrary C code, and you can go completely nuts and do whatever you want to do, right?
01:03:06
Speaker
But we're encouraging people to not have to do that. It's very nice to just say, do this thing in parallel. Again, it's still imperative code, but it feels more declarative. And I think that's, again, we're raising abstraction levels so that people get out of the muck a little bit.
01:03:21
Speaker
Okay, that gives me plenty of parallelization things to play with. The question is, it comes up again, as soon as you're doing anything fancy with CPUs, that different CPUs support different things and sticking a threading model on a single core is going to be different to 256 cores. So how much does CPU architecture matter to a Mojo programmer? How much does it leak in? How much control do I have?
01:03:50
Speaker
Yeah, so I think there's a couple of different things. There's how much...
01:03:53
Speaker
Do you have to care about? And then there's how much do you get to care about if you want to? And so generally, my view is that there's a huge range of different kinds of programmers, and they have different care-abouts. Most programmers just want to say, here's a parallel for loop, go nuts. Then you're fine. And so there, from the systems level, you want to be able to support structured, nested parallelism. You need thread libraries to compose. You need async/await to avoid
01:04:21
Speaker
getting bogged down with tens of thousands or hundreds of thousands of threads that then kill your machine, stuff like this. But I think that's pretty simple. The cool thing is when you start getting into more complicated accelerators and

Tile-based computation across hardware

01:04:32
Speaker
things like this. And so if you think about CPUs have 256 cores, GPUs have thousands of cores or thousands of threads. And the programming model around a GPU is extremely different than
01:04:46
Speaker
the traditional CPU programming model. And so one of our goals is to make it so people can write much more portable algorithms and applications. And coming back to this two-level idea: the graph level is, well, actually really hard, but it's easy to understand how you make a graph portable.
01:05:05
Speaker
You implement the graph for one thing. You implement the graph for another thing. And so you can have two different stacks that are optimized or designed for the different kinds of hardware. And the power of being declarative is that you're separating out a lot of the implementation concerns, which makes that possible. But then if you get down into writing for loops, for loops are different. For loops are imperative code, and imperative code is, inherently, kind of the bottom of the stack. And so what we've done is we've
01:05:35
Speaker
carved out the ability for people to define their own abstractions in Mojo. And then just like we were talking about with complex numbers before, you can have a complex number. It's a very simple abstraction, but it's an abstraction and you can go put a hack in it that is target specific. When you start talking about accelerators, really what ends up mattering a lot is
01:05:57
Speaker
both parallelism, but then also memory. And so how you use the memory hierarchy is the most important thing these days, particularly for GPUs and LLMs in this world that we inhabit. And so modern GPUs and CPUs have multi-level memory hierarchies, and so you
01:06:16
Speaker
The way to think about a CPU is you've got a big vector register file. And that's kind of like your L0 cache. It's like your registers. And then you have an L1 cache, which is really fast and close to the CPU, and an L2 cache. L2 cache is sometimes shared with one or two cores. Then you have an L3 cache, and it's shared with all of the cores. And then you have main memory, and you go out. And the GPU has roughly the same idea. The details are very different, but it's roughly the same idea. And so if you're writing something like a matrix multiplication,
01:06:43
Speaker
inherent to getting high performance in matrix multiplication is not just doing one dot product at a time; you have to process the workload in tiles. And so what we've seen is the emergence of various tile-based programming models, where instead of encouraging developers to think about, literally, a for loop doing a load and a store and an add and a multiply, you're thinking about processing a tile at a time. What you do is you write the algorithm for a tile,
01:07:13
Speaker
And then you use higher-level orchestration logic that says, OK, on this device I'll traverse this way, or I will prefetch the data two steps ahead, or I will get better reuse if I go vertically versus horizontally, or whatever it is. There's all these tricks that the world has developed. When we say tile, am I imagining, because we're talking about GPUs, am I imagining something that's eventually going to be printed as a square on the map in my game?
01:07:43
Speaker
Yeah, so if you're thinking about textures, so texture is a two-dimensional rectangle of data. So if you think about a texture map, so a two-dimensional rectangle, in AI you get a generalization of that called a tensor.
01:08:03
Speaker
And so you take a two-dimensional group of numbers and you make it an n-dimensional group of numbers. It's the same idea. But now just put this in your brain. I know you don't think about this every day. And so to me, it's fun to just kind of dive into this stuff. You talk about a two-dimensional array of data, even in the simple case. What really happens is it gets linearized in memory.
01:08:26
Speaker
Yes, that doesn't surprise me. And so when you go along a row, if you go over by one, you just add one to the pointer to get to the next place in memory. But if you go down a row, you're adding a whole row's worth of data. And so you're jumping. And so for a computer, it's very
01:08:46
Speaker
easy to access things that are very local, and if you start striding, it's a lot less efficient. Okay, and when that gets into multidimensional vectors, that becomes more pronounced. Exactly. And so now when you do things like matrix multiplication, the shape of a matrix multiplication, consider a two-dimensional matrix times a two-dimensional matrix, is typically that you're going horizontally through a row of one matrix and vertically through a column of the other matrix to compute an output element.
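To pin down the striding point: a row-major layout makes one direction cheap and the other expensive, purely by index arithmetic. A tiny sketch (the function name and sizes are illustrative):

```mojo
fn flat_index(row: Int, col: Int, num_cols: Int) -> Int:
    # Row-major: one column over is +1 in memory,
    # one row down is +num_cols, a much bigger stride.
    return row * num_cols + col

fn main():
    var cols = 1024
    print(flat_index(0, 1, cols))  # a neighbor in memory: 1
    print(flat_index(1, 0, cols))  # a whole row away: 1024
```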
01:09:16
Speaker
And if you could tell the GPU to arrange those two matrices differently. Yep. For example, you transpose one ahead of time. Things could be a lot more efficient. If, instead of processing one row and one column at a time, you process two rows and one column, then what that means is that when you're processing the column, you're accessing two elements next to each other, for example. And so as you generalize this out,
01:09:44
Speaker
you get this idea of a tile, and so you get this logical concept of I'm processing, for example, a two-dimensional block of memory, and I'm doing this and I'm composing against other things. Turns out modern hardware, not only is it complicated in vector and parallel and like all these other complexities we've been talking about, but they're now adding full-on matrix operations to the silicon. Right.
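A heavily simplified sketch of that tiling idea, written as plain nested loops over flat row-major buffers. The tile size, the List-based storage, and the assumption that n is a multiple of TILE are all illustrative; a real kernel would layer vectorization, prefetching, and the hardware matrix units he mentions on top of this structure.

```mojo
from collections import List

alias TILE = 64  # illustrative tile edge; real kernels tune this per cache level

# Naive tiled matmul: C += A * B over flat row-major buffers,
# assuming n is a multiple of TILE to keep the sketch short.
fn matmul_tiled(a: List[Float64], b: List[Float64], inout c: List[Float64], n: Int):
    for i0 in range(0, n, TILE):
        for j0 in range(0, n, TILE):
            for k0 in range(0, n, TILE):
                # Micro-kernel: the same three loops, confined to one
                # tile so its working set stays hot in cache.
                for i in range(i0, i0 + TILE):
                    for k in range(k0, k0 + TILE):
                        var aik = a[i * n + k]
                        for j in range(j0, j0 + TILE):
                            c[i * n + j] = c[i * n + j] + aik * b[k * n + j]
```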
01:10:13
Speaker
And so you can literally do a matrix multiplication of a very small matrix, typically, you know, like a four by four or a 16 by 16; on some accelerators you can do 128 by 128. Boom, big operations. The intuition here is that AI is important to the world and silicon is fundamentally two-dimensional.
01:10:32
Speaker
And so if you use the two-dimensional nature of silicon to put down a matrix multiplication, you can get a lot of performance and energy and other benefits from that. And so a lot of the challenge in this world is, how do I use these accelerators? How do I map these things and these tiles onto these devices? How do I use the memory hierarchy efficiently? And this becomes as important as the numerics, because the performance difference can be 10x or 100x, depending on what's going on. It's quite a big deal.
01:11:01
Speaker
And so if you go through the last 10, 15, 20 years, HPC has been dealing with a lot of these problems for many years. And we have the Fortran world. We have various C++ template libraries. We have a whole bunch of stuff that was developed and built to try to combat some of these problems. And they came up with fairly, I mean, sometimes very powerful, but fairly niche solutions. And the usability was never very great.
01:11:26
Speaker
And so this is where if you pull together Mojo's ability to talk to all these crazy hardware features like the matrix multiplication operations, build higher order combinators so that you can build libraries so that you can write the tile algorithm and not the orchestration logic because you don't want to know how that stuff works. You only want to know how part of it works. And then the compile time meta programming.
01:11:51
Speaker
that enables you to write really reusable and portable code. And so, on that, I recently gave a talk at the NVIDIA GTC conference, which is their big technical conference, about how this all composes together to make it so you can write high-performance numerics where the same algorithm works on GPUs and CPUs.
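The compile-time metaprogramming he's referring to looks roughly like this. `@parameter if` is evaluated during compilation, so each target gets only its own branch; the feature checks below come from Mojo's `sys.info` module, and the kernel bodies are stand-in prints.

```mojo
from sys.info import has_avx512f, has_neon

fn pick_kernel():
    # Resolved at compile time: branches for other targets are
    # never code-generated at all, not just branched around.
    @parameter
    if has_avx512f():
        print("wide x86 vector path")
    elif has_neon():
        print("Arm NEON path")
    else:
        print("portable fallback path")

fn main():
    pick_kernel()
```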
01:12:10
Speaker
Right. Yeah. And that's pretty cool, right? And it's not just about today's pairing of a GPU and a CPU. It's about: if you're making a major investment in building software, and that's how software works, you want your software investment to last for 10 years or 20 years, you want
01:12:27
Speaker
a certain amount of ability to adapt to the needs of new hardware. And hardware will continue to evolve; it's moving faster than ever. And so what we're doing with Mojo is helping break through some of these boundaries that have prevented people from building portable software, while still being able to utilize the high-performance, super-fancy features that people are coming out with.
01:12:46
Speaker
Okay, so how is this going to work in practice? Let's say I've written some code which is fairly fast, wearing my Python hat; I've written some in Mojo. Nvidia comes along with a new GPU with this great new instruction. I'm assuming what happens is I wait for MLIR to support that instruction, and then I just bolt it onto my Mojo library and write that.
01:13:14
Speaker
Yeah, so Nvidia is a great citizen in the software world, because every time they come out with some new chip and it becomes available, they generally provide LLVM access to their stuff. And so we can talk directly into that stack. And generally, hardware makers are very LLVM-friendly these days, which is really cool; it helps enable innovation. So Mojo can talk to MLIR and talk to LLVM and get direct access to all this stuff, which is cool.
01:13:40
Speaker
And also, let me zoom back out and say: this is deep nerdery. This is something that a very small number of very important people care about. Most people will build on top of higher-level libraries. And so we have the MAX Engine, for example, where you say: just give me a graph. I don't want to know how any of that stuff works. Here's a graph, go to town, right? Or other people say, hey, I've just got some Python code, and Python code's slow.
01:14:05
Speaker
I'll put in some types, now it goes 100x or 1000x faster. And I'm not even doing fancy accelerator stuff, I'm just running on CPU, but 100x or 1000x is pretty material for, I have a big investment in Python code and I want to make it faster and I don't want to retrain all my engineers. And so that's why that's cool. Even if you don't go
01:14:25
Speaker
all the way down to the full craziness of what the system can do. Yeah. Where I'm sitting, I don't want to write GPU-level code. I just want my code to be fast. But when I find out that it's not fast, I want to be able to fix that without someone telling me, oh, you picked the wrong language. That's exactly right. And I mean, this comes back to your earlier question of why pick Python, why a Pythonic language, right? One of the things I've learned is that
01:14:54
Speaker
How should I say this? Programmers, let me stereotype all of us. Let me make a sweeping generalization that's obviously doomed to fail. Programmers are busy people. They have things going on. Most people are not going to be just learning a new thing for the heck of it right now.
01:15:14
Speaker
Obviously there's exceptions, there's researchers, there's other people that just love learning and are passionate about things. But most people are busy, right? And so learning new things is not something people generally have the time to dedicate a week or a month to do. On the other hand, if you meet people where they are, you provide something familiar so they don't have to retrain from scratch to get up to a baseline, and you give them new tools,
01:15:40
Speaker
the programmers that I've seen generally love growing. And so, learning a new thing, here's a new trick. I saw that with Swift. When Swift launched, we said to the Objective-C world, hey, you're all familiar with classes. Classes are great. They still work in Swift. Here's a new thing called a struct. It's got these different trade-offs. It's way more efficient. It doesn't use dynamic dispatch, or whatever. And here's algebraic data types, enums. Wow, you have a payload, and you have pattern matching.
01:16:10
Speaker
Like this was a huge, huge aha moment for folks and people loved being able to learn incrementally from the base that they were on. And so what Mojo is really about is it's about meeting people where they are and then allowing them to grow in situ instead of saying you have to go off to the mountain and retrain in bootcamp for a month and then you can be basically effective in a new world. And so I think that's why people are excited about what's going on with Mojo.
01:16:35
Speaker
Yeah, I can see that. I always think the main reason I don't want to learn new things is I'm busy learning other new things. You can't get to the top of my stack so easily. Yeah. Well, I mean, it's a busy world out there and there's a lot going on. Yeah. Yeah. But

Mojo's community and open-source vision

01:16:49
Speaker
Okay. So I actually do have some Python code at the moment. If I thought, okay, well, let's give this a try and rewrite it in Mojo, what's my experience going to be like today? How mature is it? What's coming on the roadmap? Where are you going with this? Cool.
01:17:04
Speaker
Yeah, so Mojo is still a relatively young language. It's useful. It just launched last May, so it's been public for less than a year. But it's doing really well. I think we have over 175,000 people that have used Mojo. We have a nice Discord community that has over 22,000 people hanging out.
01:17:26
Speaker
all the Mojicians talking to each other, doing cool stuff. We have cool new demos and things. You've got the cute nickname, too. That's a very important step. Oh, I forgot the most important thing about Mojo, by the way. The most innovative thing is that we support an emoji file extension, in addition to .mojo. Oh, God. So that also causes people's heads to explode. I'm not going to judge that. Innovation of all sorts.
01:17:54
Speaker
But the community is doing really well. It's very exciting. Actually, as we record today, we're open sourcing a big chunk of Mojo. The entire standard library, which, as we were just talking about, is the heart and soul of the language, is being open sourced. And so we've been on a quest to open source more and more of the stack over time. And so that's a really big deal.
01:18:12
Speaker
I know that we've been public about this and telling people about it for a while, but I know people have been waiting for this for a long time. And so what we're seeing is just continued growth of the community, continued passion projects, and cool things going on, and I think this will be a huge step. And one of the things that's very important to me, as we talked about, is that I
01:18:31
Speaker
built the LLVM community from scratch from my research project at the university. I built the Swift open source community and worked in many, many other communities. And what I've seen is that open source isn't just about having code on GitHub. Open source is about having an open community, having an inclusive
01:18:50
Speaker
way of developing code together, working together with a common goal. And so we put a lot of energy into not just providing source code, but also setting up a contribution model, picking the Apache 2 license and things like this so that people have patent coverage, and all the things that follow best practices. And so I'm really excited about that. I think that people are going to have a lot of fun, and I look forward to being much more open with our development of Mojo.
01:19:15
Speaker
OK, that's cool. So if I do decide to do this, I'm going to be able to find lots of people to answer my dumb questions on Discord. Yeah. Yeah, yeah. So please join our Discord. That's a great place to go. And there's a whole bunch of everything from folks that are just interested in type theory, nerdery, to AI stuff, to I want a better Python. There's many different angles. And again, the cool thing about Mojo is that it is
01:19:41
Speaker
being built and it has to be state of the art to solve these pretty hardcore, pretty gnarly problems at the frontier of computer architecture and programming languages. But we're building in a way that it's completely general.
01:19:55
Speaker
And so we're focused on AI, and there's a lot of pain and suffering in the world of AI that we're helping to alleviate. But it turns out that a lot of people write web servers, and a lot of people do other things. And it's fantastic to see people building into that space, even though we personally don't have the expertise to invest in that.
01:20:13
Speaker
Yeah, well that's what a language is supposed to do, right? Enable other programmers. Yeah, yeah. That's exactly right. So concretely then, if I go and install Mojo and take my existing web server writing knowledge, am I going to be able to do like the equivalent of pip install the existing web server library that I like from Python world?

Bridging Python with Mojo and beyond

01:20:35
Speaker
Yeah, you can totally import and use it and it will just work. Literally, you don't have to write wrappers or anything, just import it and go. That would just work. The one problem is that if you take the Python source code and you move it to a .mojo file, you'll have to make changes right now.
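That import story goes through Mojo's documented Python interop layer; a minimal sketch, assuming the package is installed in the Python environment Mojo picks up:

```mojo
from python import Python

fn main() raises:
    # No wrappers, no bindings: load a CPython module and call it.
    var np = Python.import_module("numpy")
    var arr = np.arange(12).reshape(3, 4)
    print(arr.shape)
```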
01:20:50
Speaker
And the number one missing feature that we have right now is classes, which is a pretty big deal in Python. And so we have very strong structs and we have very strong static features. We have references and all these kinds of things. And so again, what Mojo has built out is all the C++ and Rust
01:21:09
Speaker
equivalent features, and even then there are some minor things missing, but that's its strength. And so if you come at it from a systems programming world today, then I think you'll be very comfortable. If you come back in six months, then we'll have classes and we'll have other things built in, and then you'll be more comfortable as a Python programmer.
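So, as a hedged sketch of what that migration looks like today: the port is less about wrappers and more about trading untyped `def`s for typed `fn`s and structs. How untyped `def` arguments behave is version-dependent, so treat this as illustrative; the typed version is where the speedups he mentioned earlier come from.

```mojo
# Python-style dynamic function: flexible, but little for the
# compiler to optimize.
def add_dynamic(a, b):
    return a + b

# The ported version: a typed fn the compiler can specialize,
# inline, and turn into straight machine code.
fn add_typed(a: Int, b: Int) -> Int:
    return a + b

fn main() raises:
    print(add_typed(1, 2))
    print(add_dynamic(1, 2))
```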
01:21:30
Speaker
I'm going to go and check it out and see how comfortable it makes me today. Very good. It's very fun. Obviously, I love this stuff. This is a passion of mine for years, but I love the community side of it. The thing I love about Swift, for example, is I still get people that stop me in the street and say,
01:21:50
Speaker
Wow, I recognize you. Thank you for helping drive this thing and make this happen. Because of you, I learned how to get into programming, and Objective-C was always too scary, and things like this. With Mojo, what I hope happens, and these things take a couple of years to play out, but what I hope happens is we get all these people that know Python, and they don't consider themselves to be real coders or something, but they know Python.
01:22:13
Speaker
And they can continue to grow, because they're not faced with "are you a Python programmer or a C++ programmer or a Rust programmer", this scary other threshold. And if we can get more people involved and be more inclusive of good ideas, I think we'll find computer science and these technologies can go even further and have a bigger impact than they've had so far. And I love that. And I am a true believer in developers. By the way,
01:22:43
Speaker
I'm a fan of your show. And I did not pay Chris to say that. That's what I love. Yeah. Well, I really hope it succeeds, because anything that brings new ideas to programmers and helps unify our somewhat fragmented view of programming, I'm all for it. Cool. Chris Lattner, thank you very much for joining me. Yeah, well, thank you for having me, Chris. It's great to be here. Cheers. See you again.
01:23:08
Speaker
Thank you, Chris. And I have to pick up on that last point. It makes me sad to think that there might be Python programmers out there who don't think they're real programmers. Of course you are. You absolutely are. If you're in the business of teaching computers to do things they couldn't do before, you're a real programmer. Don't let anyone tell you different. And if you're in the business of learning how to do that better and exploring new ways of doing it, then you're a friend of mine too.
01:23:36
Speaker
As usual, you'll find links to all we've discussed in the show notes. Check them out if you want to take a look at Mojo, if you want to kick the tires on it. I've been having a play and having some fun. Early days, but very, very promising. Before you go and do that, do take a moment to click like or share or rate or subscribe, because I'd love the feedback. Of course I would. And the algorithm would love to let other like-minded people know that you've enjoyed this.
01:24:02
Speaker
And that means you and I and future listeners will get together more easily. Until that time, when there's a future episode, I think it's time for me to say goodbye. I've been your host, Chris Jenkins. This has been Developer Voices with Chris Lattner. Thanks for listening.