Imposter Syndrome in Writing
00:00:00
Speaker
Did you feel uncomfortable that you don't have a PhD in machine learning and deep learning? Did you ask yourself, should I be the one that writes the book? How did you deal with that? Because I always feel like an imposter when I'm writing technical books.
00:00:15
Speaker
I am blessed with a complete absence of imposter syndrome. Oh, wow. You're the first. That's amazing. Tell us your ways. Perhaps some people in my life would tell me that a little bit of it might be good for me, but... Hey, friends, you probably knew that Text Control is a powerful library for document editing and PDF generation, but did you also know that they're a strong supporter of the developer community? It's part of their mission to build and support a strong community by being present, by listening to users, and by sharing knowledge at conferences across Europe and the United States. If you're heading to a conference soon, maybe check if Text Control will be there. Stop by and say hi. You'll find their full conference calendar at textcontrol.com. That's T-E-X-T, control.com.
Introduction to Philip Kiely and 'Inference Engineering'
00:01:03
Speaker
Hey friends, I'm Scott Hanselman. This is another episode of Hanselminutes, and today I'm chatting with Philip Kiely. He is the author of Inference Engineering, and he works at Baseten. You can check it out at baseten.co. How's it going, sir?
00:01:16
Speaker
I'm doing great, Scott. It's an honor to be here. Well, it's cool to hang out with you. This book came out of nowhere and you tweeted about it. I saw it. I saw the opening video, and then you were kind enough to send me a copy, which I have read and dog-eared. I've got a bunch of dog-eared pages here with questions that I want to ask you about.
00:01:38
Speaker
But I've got to just shout out: there's just so much slop right now. There's so much AI slop. And then a book shows up that is kind of the opposite, in the sense that it smells good. It's got a really nice design. The first thing I thought was, I hope they didn't AI-generate the design. And then you actually shout out the designer in the book who did the design custom. It feels bespoke in a moment where everything's not bespoke.
00:02:06
Speaker
Was that a conscious decision to do something like that?
Writing Challenges and AI Role in Book Creation
00:02:09
Speaker
Absolutely. I mean, the thing is, Scott, I love writing books. I've done it before, but I've never done a good job like this. It took me a few tries to get here. So I wrote Inference Engineering because, number one, I love the topic, and number two, I love writing books.
00:02:26
Speaker
And you saw in the launch video, the very first line of the video was... well, they made me take it out, but I was going to say, hey, Claude, write me a book. I just said, hey, write me a book called Inference Engineering. Make no mistakes.
00:02:43
Speaker
It just doesn't work. You just can't do it. We've got to say right off the bat: when you're writing a book about AI, you're writing a book about the technologies that power AI, and the first question an interviewer has to ask is, did you AI-slop the book and have the machine write it for you? Or is this four years of actually writing hard?
00:03:06
Speaker
So I tried my best to AI-slop this, because I came up with the idea for this book six months ago. And if you go talk to a traditional publisher, which I did, they'll tell you, once you give me a manuscript, it'll take 12 to 18 months to get this thing out into the world. And I knew that I wanted to go from idea to finished hardcovers all over the world in six months. So anything that I could do to make that faster, I wanted to do.
00:03:37
Speaker
So, 100%, I tried. I said, hey, can you come up with the outline? Can you write this chapter, write that chapter? But the quality was just completely unusable.
00:03:47
Speaker
There were a few places where I was able to use AI. So, number one... By the way, Scott, have you done any long-form writing? I assume you have. And you know, I've written a couple of books, I have. And none of them used AI, and some of them are almost a thousand pages long. And it's mostly code samples. I mean, this is the thing: ironically, we're talking about a book about AI.
00:04:08
Speaker
It doesn't feel like there's any AI. I read the book. I got the book marked up. None of my books used AI, because it just doesn't hold a through line. It doesn't hold a topic long enough.
00:04:20
Speaker
Exactly. So let me tell you how to use AI if you're writing a book, because, for the most part, I didn't. If you look at, for example, chapter four, figure 4.1, this guy right here, it's a six-line CUDA kernel.
00:04:38
Speaker
Okay, got it. I asked for an example CUDA kernel. That saved me 30 minutes. I see. It's small pieces like that, that saved me 30 minutes here, an hour there, that you can verify yourself and make sure they're good. Like I said, you've written before, you know how hard it is to look at the blank page. It's coming up with a few bad ideas so that I can throw those away and be inspired to create a good one.
00:05:10
Speaker
It's taking your outline or taking your sections and checking them for factual accuracy, checking them for completeness. But you can't just take the outputs and say, oh, well,
00:05:22
Speaker
every fact-check it just gave me means all of this is wrong. No, that's just your to-do list to go work through manually yourself. And then if you look in the back: every time I write a book, I write custom software to make writing and laying out the book easier.
00:05:36
Speaker
So for example, these QR codes and the layout of that. I didn't AI-generate those, because you can't, it's not going to be good, but I had Koso write me a script to do that. And in Appendix A, it's alphabetized. I actually tried pasting the entirety of Appendix A into ChatGPT and said, hey, alphabetize this, and it got it wrong.
00:06:02
Speaker
And then I said, hey, Koso, write me a script that will alphabetize this, and it made 20 lines of Python and alphabetized it perfectly. So the main way that I used AI to write this book was not the actual words, was not the diagrams. The words are by me. The diagrams are by human artists and human designers. It was to write the code that makes everything else work faster in the writing, editing, and publishing process.
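The actual script isn't shown in the episode, but a 20-line alphabetizer along these lines is easy to sketch. The entry format here (a non-indented term followed by indented definition lines) is a guess at how the appendix source might look:

```python
# Sort the entries of an appendix alphabetically.
# Assumed format: each entry starts at a non-indented line; indented
# lines belong to the entry above them.

def alphabetize_entries(text: str) -> str:
    entries, current = [], []
    for line in text.splitlines():
        if line and not line[0].isspace():   # a new entry begins
            if current:
                entries.append(current)
            current = [line]
        elif current:
            current.append(line)             # continuation line
    if current:
        entries.append(current)
    # Case-insensitive sort on each entry's first line
    entries.sort(key=lambda e: e[0].lower())
    return "\n".join("\n".join(e) for e in entries)

sample = "zygote\n  a cell\nAxiom\n  a premise\nbatch\n  a group"
print(alphabetize_entries(sample))
```

This is the point of a script over a chat window: sorting is deterministic, so the result is verifiably correct rather than probably correct.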
00:06:32
Speaker
And I feel like, in a moment where there's just so much crap being generated, that is a good example of how you want to use AI. I always use the word toil.
00:06:43
Speaker
If it's tedious and it's toil, then AI should do it. Like you say, generating QR codes or writing scripts to alphabetize things. But if it's something like the cover... I immediately looked at the cover and I was like, this is really cool.
00:06:58
Speaker
And I wanted to see the human name of the person that made the cover. You know what I mean? And it's cover by Luke DeHaas, shout out to Luke. I don't know, it's just that people want both.
Purpose and Audience of 'Inference Engineering'
00:07:11
Speaker
And you want the hard parts that humans work on to feel hard. And I can feel that you were probably up late working on this book. Are you OK? Did you hurt yourself in the writing of the book?
00:07:23
Speaker
There were a lot of very long writing days. The actual drafting piece was about a six-week sprint where four days a week I would wake up at 5:00 in the morning and write until I fell asleep, and then the other three days I'd go to work and do my actual job.
00:07:39
Speaker
So yeah, and of course, shout out to Luke. He's one of the brand designers that we have at Baseten, and he did a fantastic job on the cover and setting the design language for the interior.
00:07:51
Speaker
There's this little diagram of some football players in chapter one. That's my favorite art in the entire book. Yeah. I think this project was a great example of collaboration between multiple humans and between AI tools, because writing a book is toil. To overuse a metaphor, it's like running a marathon. No individual step is that hard. The hard part is taking all the steps back to back.
00:08:17
Speaker
Not every piece of toil can be automated. For example, that book you're holding in your hand is signed. Do you know how long it takes to sign 300 hardcovers? Because I just found out. I did note your hands were a little shaky when you signed the book. My handwriting is just that bad, Scott. That's just how it is.
00:08:36
Speaker
The other thing that I think is worth noting, and I want to shout out to people who might buy the book, whether the digital download or a paper copy, is that this is a dense book. There's a lot of garbage out there that is just stuff you can learn online.
00:08:52
Speaker
What I appreciated about your effort was that I had to work pretty hard at this book. At points I had to stop and look stuff up.
00:09:03
Speaker
There were a couple of times where I used ChatGPT voice mode to understand stuff better. It is dense. And it seems surprising to me that a company that makes AI easier would write a book about the hard stuff. How did you decide to write a hard book rather than a puff piece?
00:09:24
Speaker
Well, for one thing, to me, this is kind of not the hard stuff. The hard stuff is what the model performance engineers are doing every day, and the infrastructure engineers.
00:09:35
Speaker
I'm just a friendly dev rel with an undergrad CS degree. No, I get what you mean. That's another reason that I couldn't use very much AI to write this: the LLMs of today just don't know this material very well, because it's so new.
00:09:53
Speaker
But my goal for this book, number one, was that anyone would be able to pick it up and read the first 20 pages and understand my vision for where the future of AI is going. Yeah.
00:10:05
Speaker
And then number two, if they make it through those 20 pages and they're really excited about the topic, any smart person with a decent technical background is going to be able to make it through the rest.
00:10:19
Speaker
And that meant making some trade-offs. I've definitely heard one of the, I think, most valid criticisms of this book, which is that it's a mile wide and an inch deep. It's very much a survey of all of these technologies. It doesn't necessarily teach you how to go hands-on and step by step do something. It's not one of your thousand-page books with a ton of code samples and a ton of hands-on guidance. And the reason I made that decision is partially just because I wanted a book that I could write very quickly, and partially because the goal of this book is more to teach someone what questions to ask rather than give them all of the answers. So yeah, it is a dense book. It's a hard book, but
00:11:03
Speaker
all of the salespeople, when they join Baseten, they read this and they get through it. So there's definitely, I think, a level of accessibility to the content that I prioritized.
00:11:16
Speaker
Interesting. Yeah. Survey seems too light a word for it; it is an overview, surely, of everything that is going on right now at this moment in time. And I'm sure that in two years, or maybe a year, you should probably start working on the second edition. Absolutely. Right. Because that's the other irony of having a physical book with actual pages and actual dead trees: how much of this is out of date? Not a lot. It's actually pretty fresh.
00:11:48
Speaker
But it did give me a lot of questions to ask. And it's funny that you mention that, because I'm looking here at my first question. I made it, let me see here, 41 pages in before I started dog-earing and looking stuff up.
00:12:02
Speaker
So you said, if smart people can make it 20 pages in, that's a good sign. I made it 41. Well, there you go. 42 pages. So I feel somewhat validated about this. Did you feel uncomfortable that you don't have a PhD in machine learning and deep learning?
00:12:19
Speaker
Did you ask yourself, should I be the one that writes the book? How did you deal with that? Because I always feel like an imposter when I'm writing technical books. I am blessed with a complete absence of imposter syndrome.
00:12:32
Speaker
Oh, wow. You're the first. That's amazing. Tell us your ways. Perhaps some people in my life would tell me that a little bit of it might be good for me. But I think that there's a very academic take on AI that starts in the early days and defines what a perceptron is and teaches you what a convolutional neural network is and all that kind of stuff. And I entered the AI industry in January of 2022, about 10 months before ChatGPT became a huge thing.
00:13:08
Speaker
And I didn't know any of that stuff. And that hasn't stopped me from doing my job today. I'm sure all of that stuff is very valuable, and I've enjoyed learning bits and pieces of it along the way.
00:13:22
Speaker
But I believe, and this is part of the thesis of this project, that you can make a valuable contribution to the AI industry without an incredibly rigorous academic background. The technologies of today are so new and moving so fast that in some ways it's an advantage to not know what you're doing, because at least this way you don't have any bad habits to break.
00:13:47
Speaker
Yeah, that's interesting. It's funny that you mention that, because I've talked about what it feels like to be toward the other side of your career. One could argue that you're in the opening 10 years of your career and I'm in the latter 10 of mine.
00:14:02
Speaker
So we shouldn't gatekeep this stuff just because someone's old or someone's young, or because someone came in at the beginning of the hockey-stick growth in inference, or started at MIT, or worked on the project in the seventies with so-and-so.
00:14:20
Speaker
So you're right. Young people with fresh ideas and fresh perspectives are what's going to actually do the work and make the thing useful. So one of the questions I wanted to ask about usefulness: this book talks about hardware, talks about software,
AI in Practical Tasks vs. Creativity
00:14:37
Speaker
it talks about, like I said, an overview of all the different GPUs, how they work, performance profiling. But underneath it is a reminder that none of this is anything unless it actually does interesting and helpful work.
00:14:51
Speaker
Do you think about what can be done with AI? Because we opened this podcast by saying it can't write a book. And I don't know if I want it to write a book. I'm not interested in AI slop. I'm interested in AI, to our point, doing toil. So I'm curious about your personal opinion.
00:15:08
Speaker
Here we are with a book on AI inference, but the book couldn't today be written by an AI and be any good. I think very little about vertical AI applications, actually, because I spend so much of my day taking other people's vertical applications and helping them figure out how to make them twice as fast and half as expensive.
00:15:31
Speaker
So I will say that, as a software engineer off and on throughout my career, obviously the past couple of months have been very exciting, and I feel like I have regained the ability to ship production code. There are some trade-offs in coming into more of the content space, in taking six months and writing a book, and in that your other skills can atrophy a little bit. So I find that I've been able to
00:16:11
Speaker
augment the sort of missing capabilities and ship the stuff that I'm really excited about, no matter where it is. Another thing is video editing. I do a lot of video editing with Descript because I'm not a fast or skilled editor. If anyone had a chance to watch my college YouTube videos before I took those down a few years ago, they would certainly know that I'm no Casey Neistat.
00:16:38
Speaker
That's kind of the vision that I have for this space. It doesn't particularly matter what my vision is; what matters is the vision of the people who are actually building the tools, not just making them faster. But my vision and my hope is that I'm going to be able to keep doing the things that I really love, outsource the pieces that I am not so good at, and create things that otherwise I just would have left on my to-do list.
00:17:04
Speaker
That's a healthy attitude. I appreciate that. You're staying in your lane, and you might dip your toes in another lane, but you know what you're good at and you stay away from the stuff that you're not.
00:17:15
Speaker
So Baseten is about speed, right? It's about making it better. And a lot of people, myself included, are concerned about
Optimizing AI Performance
00:17:23
Speaker
wasting energy on AI. And you spend a lot of time in the book talking about performance. There's a ton. You start right off the bat with how they're already making it faster. You talk about the difference between time to first token versus throughput.
00:17:37
Speaker
The whole thing is about throughput. Honestly, if there was a through line for the book, it's that it's getting faster and it's going to get faster. How much faster do you think it can actually get? And is it going to be good in the sense that we'll waste less energy and it won't cost this many teaspoons of water to ask a question of a model?
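For readers new to the two metrics mentioned here: time to first token is how long the user waits before any output appears, while throughput is how fast tokens stream after that. A small sketch with made-up timestamps:

```python
# Two latency metrics for streaming LLM inference, computed from
# the wall-clock times at which each output token arrived.
# `request_start` and `token_times` are hypothetical measurements.

def ttft(request_start: float, token_times: list[float]) -> float:
    """Time to first token: how long until the user sees anything."""
    return token_times[0] - request_start

def throughput(token_times: list[float]) -> float:
    """Tokens per second over the generation, after the first token."""
    elapsed = token_times[-1] - token_times[0]
    return (len(token_times) - 1) / elapsed

start = 0.0
times = [0.25, 0.30, 0.35, 0.40, 0.45]   # five tokens arrive
print(ttft(start, times))     # 0.25 seconds to first token
print(throughput(times))      # ~20 tokens/second thereafter
```

The two can move independently: a heavily batched server might have great throughput but a sluggish time to first token, which is why the book treats them as separate dials.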
00:18:01
Speaker
So with performance, there are a couple of ways to think about it. It's a multi-variable optimization problem. But let's think about just three variables right now.
00:18:12
Speaker
Speed; cost (cost and throughput are the same thing); and quality, the correctness of the output. Quality is fixed at the model level, so let's take the model as fixed for a second. Now we only have to think about two variables, which are speed and cost, or throughput.
00:18:32
Speaker
And you can plot an efficient frontier where on one axis you have speed and on the other you have cost. You can be maximally fast at a very high cost.
00:18:43
Speaker
You can be maximally cheap, but you're not going to be very fast. And there's an efficient frontier curve between those two. Some model performance work is just figuring out where that efficient frontier is and helping you adjust the knobs and dials within your inference engine to get to the place you want on that frontier.
00:19:06
Speaker
And then some performance work, more on the research side, is pushing that frontier out. And when you do that, when you create some kind of optimization that unlocks better performance, you can choose to move along the new frontier to either more speed at the same cost or less cost at the same speed.
00:19:31
Speaker
Then you add back the quality axis, and suddenly your efficient frontier becomes a sphere instead of a curved line. And as you, say, fine-tune a small open source model to replace a large model, you get a new performance frontier and find a spot along it. Ultimately, as this research continues, you're able to push your options around speed, cost, and quality out further and further.
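One way to make the frontier idea concrete: given measured (speed, cost) points for several engine configurations, the efficient frontier is simply the set of configurations not dominated by any other, i.e. where no other config is both faster and cheaper. The configuration names and numbers below are invented for illustration:

```python
# Hypothetical (tokens_per_sec, dollars_per_million_tokens) measurements
# for different inference-engine configurations. The frontier keeps only
# Pareto-efficient configs: no other config is both faster and cheaper.

configs = {
    "batch=1, fp16":  (90.0, 12.0),   # fastest, most expensive
    "batch=8, fp16":  (60.0,  6.0),
    "batch=32, fp16": (35.0,  3.0),
    "batch=8, fp8":   (75.0,  5.0),   # quantized: beats batch=8 fp16
    "batch=64, fp8":  (30.0,  2.0),   # cheapest, slowest
}

def efficient_frontier(points: dict[str, tuple[float, float]]) -> list[str]:
    def dominated(name: str) -> bool:
        speed, cost = points[name]
        return any(s >= speed and c <= cost and (s > speed or c < cost)
                   for n, (s, c) in points.items() if n != name)
    return [n for n in points if not dominated(n)]

print(efficient_frontier(configs))   # the four non-dominated configs
```

Here "batch=8, fp16" drops off the frontier because the fp8 variant is both faster and cheaper, which is exactly the "pushing the frontier out" move described above.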
00:20:00
Speaker
Again, kind of staying in my lane: what I'm focused on is making the sphere as big as possible, making it so that you have as much performance, using performance as a vague umbrella term, to spend on these trade-offs.
00:20:20
Speaker
Where does that actually go? We definitely see, with inference getting cheaper and faster, that the same amount of traffic you maybe had six months ago, you can now serve with a fraction of the hardware, or for a fraction of the cost, or at a much better speed.
00:20:42
Speaker
But that is overshadowed by increases in demand. It's funny, because a lot of the time we spend as a company is trying to figure out how to get our customers to pay us less, because if you can make your inference X times faster and X times cheaper, you need fewer GPUs and there's less consumption. But that always gets rewarded with more workloads and more demand from the market.
00:21:11
Speaker
There are perhaps three layers to the answer to the question. There's the micro layer: on a workload-by-workload basis, yes, we're making it faster and cheaper and less consumptive.
00:21:26
Speaker
Then at the macro level above that, maybe that increases demand. But at the macro level above that, well, the demand was there anyway. So eventually we're going to be able to fulfill it, and we'd better do so in the most efficient way possible.
00:21:42
Speaker
Yeah. There's the old computer engineering joke: you can have it good, fast, or cheap; pick two. There's always this idea that you can never have all three. But I think in the world of inference, in the world of AI, people expect, and the market expects, all three.
00:22:00
Speaker
And what I'm hearing you say is that you can optimize for two and then you'll bring that third one back up. So there will be good and fast, and then maybe it won't be cheap, but then it will become cheap. And we are consistently seeing people using hard math and hard computer science to make it good, fast, and cheap all at the same time.
00:22:19
Speaker
And then the fourth one, and this is one that a lot of people don't really talk about but it's actually very critical, is reliable. You have, obviously, all of the questions around model reliability: the prompting, the JSON mode output, structured output, function calling, whatever you're doing at the model level to make it reliable. Setting that aside, there's the infrastructure reliability piece. How many nines of uptime are you getting from GPUs, which notoriously are not a very high-nines piece of hardware?
00:22:53
Speaker
You've got that. You've got the cloud provider reliability piece. You have the reliability of the inference optimizations that you're making: when you quantize, are you keeping quality intact? Is some novel speculation algorithm that you're introducing going to run into some niche bug and cause out-of-memory errors every so often? There's all kinds of applied research in the AI space that in other industries might take years or decades to make it into production. And here we go from paper to running on the GPU serving massive production traffic in weeks, in many cases.
00:23:32
Speaker
So there's a lot of work around taking research-grade ideas and research-grade software and hardening them to the point that you can serve a customer who has a four- or five-nines uptime SLA because they're doing something critical, like an AI tool for doctors or some kind of financial tool.
00:23:55
Speaker
I want to shift gears a little bit, because I'm thinking about, like you're saying, getting into production. I'm seeing people call themselves prompt engineers a lot, and that kind of bugs me. I don't really like that term, because there's a feeling that you can really control the black box by just poking at it with prose.
AI Proofreading and Human Collaboration
00:24:14
Speaker
And I'm wondering if you can maybe juxtapose, or explain the difference between, prompt engineering and inference engineering. Absolutely. We are building the playground that the prompt engineers hang out in. There is some overlap. So for example, at the inference engine level, you can guarantee certain structured output pieces. If you look back at 2023, 2024, there was a big movement around JSON output from models and how you get a model to return an object. For a long time there was a very prompt-first view of this problem, which is like, you must return JSON and only JSON or my grandma's going to die.
00:25:01
Speaker
And that's just not something that you want to bet your production workload on. So now, through the inference piece, we have the ability to guarantee a structured output. The way that works is you define a schema, you send that in along with your prompt, and what the inference engine does is it creates a state machine and uses something called logit biasing. So, to dive into the concepts of inference a little bit:
00:25:29
Speaker
A large language model has something called a vocabulary, which is every single one of the, say, 100,000 tokens it could possibly produce. And when it does a forward pass for inference to create that next token, what it actually does is create a vector of 100,000 probabilities of which token might be generated.
00:25:50
Speaker
With logit biasing, you do something called masking that vector. Basically, you take every token that would not be valid syntax according to the state machine for your JSON schema, and you set its probability to zero, or its logit to negative infinity, depending on how you're doing it.
00:26:10
Speaker
And then you still let the model do its generative thing, but you structurally guarantee that no invalid tokens are going to be generated. And prompting, of course, still matters in this case, because the contents of the JSON object you're creating are not guaranteed by the schema. Producing accurate, quality data is still up to you on the prompting side. But the actual structure of the output is now guaranteed by the inference engine.
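A toy sketch of that masking step, with a tiny invented vocabulary and a hand-picked allowed set standing in for the schema-derived state machine (a real engine compiles the allowed-token sets from your JSON schema):

```python
import math

# Toy illustration of logit biasing for structured output. The
# vocabulary and the "allowed next tokens" set are invented; a real
# inference engine derives the allowed set from a state machine
# compiled from your JSON schema.

vocab = ["{", "}", '"name"', ":", '"Ada"', "hello", "3"]

def softmax(logits: list[float]) -> list[float]:
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mask_logits(logits: list[float], allowed: set[str]) -> list[float]:
    # Tokens that would be invalid syntax get logit -inf, so after
    # softmax their probability is exactly zero.
    return [x if tok in allowed else -math.inf
            for tok, x in zip(vocab, logits)]

raw = [0.1, 0.2, 1.5, 0.3, 0.4, 2.0, 0.9]   # model's raw scores
# Suppose the state machine says: after '{' only a key may follow.
allowed_next = {'"name"'}
probs = softmax(mask_logits(raw, allowed_next))
print(probs)   # all probability mass lands on '"name"'
```

Note that "hello" had the highest raw score; the mask makes it structurally impossible anyway, which is exactly the guarantee prompting alone cannot give you.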
00:26:43
Speaker
So that's an example, and that's been the standard in production now for 18 to 24 months, of where the inference side can come in and help the prompt side. But there are also places the prompt side can come in and help the inference side.
00:26:59
Speaker
So one of the biggest optimizations we do is KV cache reuse, where we're able to take the first however-many tokens of a prompt and reuse them between different requests, as long as they match perfectly.
00:27:12
Speaker
So for example, if you put your system prompt before your conversation, or if you put all your shared context up front, and then you change whatever's new about the specific request at the back of your prompt, your actual inference speed is going to be many times higher than if you put the novel stuff at the front of the prompt.
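A toy model of why that ordering matters: real engines cache the attention keys and values computed per token, but the reuse rule is essentially longest-matching prefix. The tokens and requests below are invented for illustration:

```python
# Toy model of KV-cache prefix reuse. A real engine caches attention
# keys/values per token; here we just count how many leading tokens
# of a new request were already computed for a prior request.

def shared_prefix_len(a: list[str], b: list[str]) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

system = ["You", "are", "a", "helpful", "assistant", "."]

# Shared context first, novel part last: long reusable prefix.
req1 = system + ["Summarize", "document", "A"]
req2 = system + ["Summarize", "document", "B"]
print(shared_prefix_len(req1, req2))   # 8 tokens reusable

# Novel part first: the match breaks immediately.
req3 = ["Document", "A", ":"] + system
req4 = ["Document", "B", ":"] + system
print(shared_prefix_len(req3, req4))   # 1 token reusable
```

Same tokens in both cases; only the ordering changes, and with it how much prefill work the cache can skip.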
00:27:32
Speaker
So there's a lot of interaction, actually, between the prompt and the inference layer. They have to work together to make a system completely optimized.
00:27:44
Speaker
I do think, though, that people over-pivot on their prompts, especially when they're first starting to build an actual production system. You see people on Twitter like, this 20-line MD file is going to change the way your whole system works. It's like putting DON'T DO THIS in all caps.
00:28:03
Speaker
There's no guarantee that it's not going to do that. Like you just said: make sure that you return JSON, in all caps. And it's like, yeah, but I said it in all caps and it still did it the other way. It's an ambiguity loop, and you're going to get ambiguous results unless there's something beyond the probabilistic, actual deterministic code that says this will only return JSON. You need a firewall, I guess is what I'm saying. And the way you're describing that mask, it is a kind of firewall. The machine's going to do what it's going to do, but we're not going to let bad results come out no matter what the prompt says.
00:28:39
Speaker
Exactly. It's the mix of using AI for what it's good at and using straight-up Python for what it's good at. So, for example, when I was proofreading the book: you would think models would be great at proofreading. They're not.
00:28:54
Speaker
And one of the things that makes them not good at it is that if you dump 47,000 words into a single chat window, it's not going to be able to go in and catch every error. But if you write a script that sends the book to the model a page at a time, you're going to get much better results.
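A sketch of that page-at-a-time approach. The model call is abstracted as a callable so any LLM client could be plugged in, and the 300-words-per-page figure is an assumption, not the author's actual script:

```python
# Page-at-a-time proofreading sketch. `ask_model` is any callable that
# takes a prompt string and returns the model's reply, so the chunking
# logic stays independent of which LLM client you use.

def paginate(text: str, words_per_page: int = 300) -> list[str]:
    words = text.split()
    return [" ".join(words[i:i + words_per_page])
            for i in range(0, len(words), words_per_page)]

def proofread(text: str, ask_model) -> list[str]:
    reports = []
    for i, page in enumerate(paginate(text), start=1):
        prompt = f"Proofread this page and list any errors:\n\n{page}"
        reports.append(f"Page {i}: {ask_model(prompt)}")
    return reports

# Stub standing in for a real LLM call:
fake_model = lambda prompt: "no errors found"
book = "word " * 1000   # a ~1,000-word stand-in manuscript
print(proofread(book, fake_model))
```

Each call gets a context small enough for the model to actually attend to, which is the whole trick: the script does the bookkeeping, the model does the reading.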
00:29:12
Speaker
So there's that aspect of it, as well as the fact that just knowing how these systems work under the hood makes you better at using them and helps you build systems that get better results. See, then you would love the analogy that I know my listeners are sick of me using, which is: learn how to drive stick shift.
00:29:32
Speaker
If you learn how to drive stick shift, you have a different relationship with the vehicle. And I think a book like this is very much: here's how the internal combustion engine works. And now that you know how to drive stick,
00:29:43
Speaker
go off and think about how to move people around, think about the concepts around transportation, but always have the engine, and what's happening in inference engineering, underneath, so that you know what you can actually affect in the larger system.
00:29:57
Speaker
Scott, that would be a fantastic metaphor, but I don't know how to drive. Oh my God, you're killing me. Big Waymo guy. See, this is the thing. I love this. We're ending the show with you being a Waymo person. But see, the system requires you to know how to drive. So the Waymo breaks down,
00:30:15
Speaker
you're going to get eaten by the zombies first, is what I'm hearing. And I'm going to jump in, hotwire the car, throw it into first gear, and zip away. That's true. That's true. But at least I understand, when the Waymo makes a mistake, why it's making it. So it just gives me a little bit more empathy for the machine.
00:30:33
Speaker
This is why, when the zombies come, we team up. We team up. It's a Philip and Hanselman partnership where I will drive stick and you will handle the Waymo. Between us, the zombies will not get us.
00:30:47
Speaker
You're welcome on my zombie apocalypse squad anytime. I do also bring about 20 years of martial arts experience. That's my primary zombie apocalypse contribution. Very good. Well, I have a black belt in Taekwondo, but I'm more of a slap fighter, so I will be standing behind you when the zombies come.
00:31:04
Speaker
Well, this has been fun. And congratulations. It's hard to write a book, it's hard to ship a book, and it's hard to get it into people's hands. And you, and the folks you work with at Baseten, pulled it off. Thank you.
00:31:18
Speaker
It is very hard to get it into people's hands. I have made best friends with the members of my local FedEx office. Yeah, it is hard work.
Distribution Challenges and Solutions
00:31:28
Speaker
4096 in San Francisco. Highly recommended. Shout out.
00:31:31
Speaker
Shout out to the crew there. There's definitely a lot to learn. And another thing I wanted to say: I really thought the QR codes at the end were such a nice touch. I just hate the idea that I'm going to get a book and have to type the URLs in. And there's a huge appendix. Honestly, there's probably 50 pages of just good material, including a glossary. It's just a thoughtful thing. So maybe the haters say it's not deep; I found it to be just the right amount of deep,
00:32:00
Speaker
as a great survey of inference engineering. And I enjoyed it very much, and I appreciate you sending me a copy. Well, fantastic, Scott. That means a lot coming from you. I'm glad. So you can go ahead and check them out at baseten.co. That's B-A-S-E-T-E-N dot co. Right at the top there, you can get your copy, get your digital copy. And there's a wait list now for the paper copies, because they're hard to get and they're super popular.
00:32:25
Speaker
They are. And Amazon, despite the fact that I have thousands and thousands of people now signed up to get paper copies, doesn't seem to want to list it. So we're going to have to find a way around that.
00:32:40
Speaker
Well, hopefully you'll figure it out soon. Thank you so much, Philip Kiely, for chatting with me today. Thank you, Scott. Have a good one. This has been another episode of Hanselminutes, and we'll see you again next week.