
Beyond AI Hype, What Will Developers Actually Use? (with Zach Lloyd)

Developer Voices

If AI coding tools are here to stay, what form will they take? How will we use them? Will they be just another window in our IDE, or will they push their way to the centre of our development experience, displacing the editor? No one knows, but Zach Lloyd is making a very interesting bet with the latest version of Warp.

In this deep dive, Zach walks us through the technical architecture behind agentic development, and how it's completely changed what he and his team have been building. Warp has gone from a terminal built from scratch to what they're calling an "agentic development environment": a tool that weaves AI agents, an editor, a shell, and a conversation into a single, unified experience. This may be the future or just one possible path; regardless, it's a fascinating glimpse into how our tools might reshape not just how we code, but how we experience programming itself.

Whether you're all-in on agentic coding, a skeptic, or somewhere in between, AI is here to stay. Now's the time to figure out what form it's going to take.

# Support Developer Voices

- Patreon: https://patreon.com/DeveloperVoices

- YouTube: https://www.youtube.com/@DeveloperVoices/join

# Episode Links

- Warp Homepage: https://warp.dev/

- Warp Pro Free Month (promo code WARPDEVS25): https://warp.dev/

- Previous Warp Episode: https://youtu.be/bLAJvxUpAcg

- SWE-bench: https://www.swebench.com/

- TerminalBench: https://github.com/microsoft/TerminalBench

- Model Context Protocol (MCP): https://modelcontextprotocol.io/

- Claude Code: https://claude.ai/code

- Anthropic Claude: https://claude.ai/

- VS Code: https://code.visualstudio.com/

- Cursor: https://cursor.sh/

- Language Server Protocol (LSP): https://microsoft.github.io/language-server-protocol/

# Connect

- Zach on LinkedIn: https://www.linkedin.com/in/zachlloyd/

- Kris on Bluesky: https://bsky.app/profile/krisajenkins.bsky.social

- Kris on Mastodon: http://mastodon.social/@krisajenkins

- Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/

Transcript

Introduction to AI Hype and Impact

00:00:00
Speaker
I'm slightly scared to introduce this week's topic because we're going to talk about a field that is massively hyped at the moment. And I normally have very little appetite for hype. And yet, once you get behind the hype of AI and LLMs, there is undeniably some serious substance here.
00:00:18
Speaker
I personally have seen Claude chew through a day's work in 10 minutes and do it well enough that I can put in 10 minutes of my own tweaks and merge it.
00:00:29
Speaker
20 minutes in total for a day's work. Once you've seen that, you can't unsee it. Pandora's box is open. Like it or not, these tools are going into our toolbox in some form.

AI in Development Tools: What Form Will It Take?

00:00:41
Speaker
And the question for me is, what form will that actually take? Is it going to look like a chat box that hides in the corner of your IDE, trying not to remind us all of Microsoft Clippy?
00:00:53
Speaker
Is it going to be a web form that's conversation first and code only when necessary? Is it going to look like a terminal app, like the one Anthropic have been working on and Google have kind of shamelessly copied from them?
00:01:06
Speaker
It's a technical question, but it's also a software design question and a human design question. These tools are coming, but how are they going to fit into the way we want to work?
00:01:18
Speaker
What's a good fit for the human-computer interaction?

Guest Introduction: Zach Lloyd from Warp

00:01:21
Speaker
Joining me this week for an answer is a returning guest, Zach Lloyd of Warp. He was excellent when we had him on last year talking about how you build a terminal from scratch.
00:01:32
Speaker
And since then, that terminal has warped into a vision of agentic development for the future. Something similar to the way other people are doing it, but quite different and quite unique in its own style.
00:01:46
Speaker
So I thought we'd get him in and we can pick apart how AI is going to affect programming generally, and how he thinks it's going to work with us humans specifically. I'm your host, Kris Jenkins.
00:01:58
Speaker
This is Developer Voices. And today's voice is Zach Lloyd.

Warp's Transition to an Agentic Environment

00:02:14
Speaker
I'm joined once again by Zach Lloyd. Zach, how are you doing? I'm good. Thanks for having me, Kris. Pleasure, pleasure as always. I was just thinking back, because we last spoke almost exactly a year ago.
00:02:28
Speaker
And we were talking about Warp, and what it takes to build a terminal, and GPU shaders written in Rust, and going through ANSI escape codes, and all that good stuff.
00:02:41
Speaker
And then towards the end, we touched a little on putting LLMs into a terminal. And looking back, I thought at the time, and 12 months is a long time in programming, 12 months ago I thought: this is...
00:02:58
Speaker
This is probably a really useful thing. I can never remember the flags to set either. But a lot of LLM stuff is a gimmick. I didn't know how much it was a useful thing versus how much it was every startup having to say they're doing AI now.
00:03:15
Speaker
And fast forward 12 months, and my opinion of AI and agentic coding has completely changed. I'm absolutely certain yours has.
00:03:26
Speaker
Yes. So take me through what's happened to Warp in the past 12 months. Yeah, you know, at this point we don't even describe Warp as a terminal anymore. We describe it as an agentic development environment.
00:03:39
Speaker
You can use it as a terminal, and a lot of users who've used Warp for a long time use it as a terminal, and it works great as a terminal. But what's happened in the past 12 months is that the LLMs have advanced to a point where, for a lot of development tasks, it really makes sense to...
00:04:00
Speaker
approach them with an agent-first mindset. And what that means is, instead of coding things by hand or writing commands by hand, what developers are increasingly doing is starting every task by prompting an LLM.

LLM-Enhanced Development: Why Warp Fits

00:04:16
Speaker
And it turns out that the terminal interface, the terminal form factor, is actually a great place to do that. And you can see that in Warp, and you can see that in the rise of a bunch of CLI-based agent tools like Claude Code, Codex, and Gemini CLI.
00:04:39
Speaker
The world has totally changed within a year in the approach to how you develop, whether it's setting up a new project, writing code, deploying, or debugging production.
00:04:56
Speaker
It's just much easier and more effective to do it by starting the task by asking an LLM to help you do it. And so Warp, we started as a terminal; it's a great terminal. But the form factor and the UX that we built around the command line work extremely well, not just for running commands, but for launching agents. And since we think that's the future, that's where we're placing our bet right now. And it's adding a
00:05:31
Speaker
ton of value for people who embrace the new way of working. Is that something you think you saw coming 12 months ago? Or is it just like, oh, we're in the perfect place to pivot?
00:05:44
Speaker
12 months ago, yes. Two years ago, no. So 12 months ago, we had already, I think, launched Agent Mode. And there are a lot of things called Agent Mode in the world now; Warp was actually the first to name something Agent Mode.
00:06:00
Speaker
And the use case for us back then was: instead of just typing a command, you can type in English. And the...
00:06:12
Speaker
the LLM would interpret what you were doing in the same terminal input. So we had this a year ago. The primary use cases it was really good at, at that point, were what you would think of as terminal things: doing stuff with Git or Docker or your AWS CLI. What has really changed for Warp in the last three to six months is that Warp is now excellent at coding,
00:06:41
Speaker
which was a big effort on our part. Not to brag, but we're now top five on SWE-bench, which is the measure of quality of coding agents.
00:06:54
Speaker
And we're number one on TerminalBench, which is the measure of how good your agent is at doing terminal-type things and coding things from the terminal.
00:07:07
Speaker
So we really went all in on that. We added a code editor within Warp, which is not an IDE at all. It's not like the primary interface of using Warp is that you open it up and have a bunch of static file views. It's a code editor for adjusting and editing code that an agent has produced, which we think is all that's needed in this new world.
00:07:34
Speaker
So I would say, to answer your original question, a year ago we sort of saw this

Integrating LLMs into Development Tools

00:07:39
Speaker
coming. Two years ago, definitely not. The company itself is five years old and started with the product vision of just: let's make an incredible command-line UX. I would say we've gotten kind of lucky that that UX is actually so well suited to the new world of LLMs.
00:08:01
Speaker
Yeah, you do seem to have landed in a place that's definitely near the starting line for this new race, right? Yeah. And I think we have a really unique position in the landscape of...
00:08:15
Speaker
developer tools right now, in the sense that there are sort of three approaches to helping developers work with LLMs in their daily workflow, or maybe four. There's the IDE-based approach, which is, I think, still the biggest. That's like Copilot and Cursor and... Yeah, something embedded in VS Code. Yeah, it's embedded in VS Code. And there are really a lot of these now. And they tend to...
00:08:42
Speaker
They've come to market with a great auto-completion experience and are now adding an agent experience in a chat panel. So that's one approach. There's a second approach, which is the CLI tools that I mentioned, like Claude Code and Gemini CLI, which are literally CLI apps that you run within a terminal.
00:09:02
Speaker
Then there is Warp, which is most similar, I think, to the CLI approach, except that we are the terminal. And so it's just baked into the app itself rather than running as an app within Warp.
00:09:16
Speaker
And that gives us a bunch of advantages in terms of what we can do with the user experience that you can't do if you're just a text-based terminal app, like having a code editor and a code review experience, and having a UI for seeing what all your agents are doing.
00:09:31
Speaker
So Warp's kind of the only one that I'm aware of that has that approach. And then there's a fourth approach, which is cloud agents. So that would be like Devin,
00:09:43
Speaker
and maybe a company called Factory, where you have these agents that aren't really being started by a human necessarily, or they're being started in Slack or something like that.
00:09:55
Speaker
And it's not like there's a proper developer workbench for them. It's more like they're autonomously doing things in the cloud. So those are the approaches that I'm aware of to this new world.
00:10:08
Speaker
And do you think... Are you trying to capture a particular mindset? I mean, I guess you're going after someone like me. I'm mostly in the terminal, flipping in and out of vi, and I have been using Claude Code a lot.
00:10:23
Speaker
Okay. Am I a target? I must be a target customer. Is that the persona of developer you're looking at? Yeah, I don't think of it as much as: are we going after terminal users or IDE users?
00:10:37
Speaker
We're going after the general pro-developer population that is embracing this new way of developing by prompt rather than developing by hand.
00:10:51
Speaker
For people who really like the terminal interface, I do think we have a natural advantage there, in that that's our DNA as a company:
00:11:03
Speaker
really focusing on the command line and command-line users. But even if you look at the IDE-based approaches to agents, the IDEs like Cursor and Windsurf, they're basically rebuilding a bunch of terminal functionality inside of a chat panel, inside of VS Code.
00:11:24
Speaker
And so I think everyone, to some extent, is going to be working in this new way. So I don't really think of it as terminal versus IDE. I think of it as: we're trying to build an app that supports this new workflow of agent-first or prompt-driven development. And I think it's applicable to all developers. You don't have to be a terminal aficionado to want to work this way.
00:11:51
Speaker
Yeah. Do you know, that makes me think we have to get into this angle of it, because the thing that pushed me over the edge with these things is that the LLMs themselves have got better in the past year.
00:12:04
Speaker
But what's really pushed me over the edge is that now they can integrate with tools and start gathering information about the local context: running a grep command like a real human would, getting the output of that, and feeding that in. It's not just an LLM. It's a network of planning LLMs and implementing LLMs and tooling combined together. And I'm just wondering, how have you built that

Technical Challenges of LLM Integration

00:12:32
Speaker
out? Have you taken an off-the-shelf LLM and mashed it into a bash shell, or what's the actual architecture to make this work well?
00:12:41
Speaker
Yeah, so for the actual architecture, I guess I could walk through the life of an LLM request, or a conversation, in Warp. I think it might be interesting for folks.
00:12:52
Speaker
So let's say you want to build a feature using an LLM in Warp. The way it will work is: in the same place where you might type a terminal command, you can type a prompt.
00:13:09
Speaker
And so let's say that prompt is like, hey, I want to build a feature that's a new dropdown menu in the app that does XYZ. You can literally type that or you can literally speak it.
00:13:22
Speaker
And so the first thing that we'll do is detect: OK, is this a prompt or is this a command? Let's say it's a prompt. If it's a prompt, then that prompt goes up through our server.
00:13:36
Speaker
It's the user part of the conversation. The user prompt is combined with the system prompt. We also, at that moment, will look to see if you have things like rules defined. Rules are persistent things, like the coding conventions in your code base or stuff like that, that need to be added to the context.
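To make that concrete, here's a minimal sketch of how a client might fold persistent rules into an otherwise stateless request. The file name, prompt text, and structure are invented for illustration; this is not Warp's actual implementation.

```python
# Hypothetical sketch: combining a system prompt, project rules, and the
# user's prompt into one request body. Names and paths are invented.
from pathlib import Path

SYSTEM_PROMPT = "You are an agent that helps with development tasks in a terminal."

def build_request(user_prompt: str, project_dir: Path) -> dict:
    rules_file = project_dir / "RULES.md"   # persistent coding conventions
    rules = rules_file.read_text() if rules_file.exists() else ""
    system = SYSTEM_PROMPT
    if rules:
        system += "\n\nProject rules:\n" + rules
    return {
        "system": system,
        "messages": [{"role": "user", "content": user_prompt}],
    }
```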
00:14:01
Speaker
Basically, at that point we'll start the conversation. And in the conversation, we have registered with the LLM a bunch of tools that it has available, right?
00:14:18
Speaker
And so the tools that the LLM has available when working with Warp are the kinds of things that you've mentioned. So it has a tool where it can run terminal commands.
00:14:29
Speaker
That's kind of the main tool. So if it wants to gather context by running grep or find, or running git commands, all that stuff, it has that tool available.
00:14:39
Speaker
It has a tool available to do semantic search over your code base. We have a vector embedding of the code base that lets us take the relevant part of the prompt and figure out which files are likely to have relevant information concerning it. So it has that tool. Okay, we've got to get into that in detail, but let's carry on for now. So it has a vector embedding. It also has the ability to use MCP servers. Just to explain what MCP is, in case people don't know: it's the Model Context Protocol, from Anthropic.
00:15:22
Speaker
This is a way that the LLM can call external services to gather information. For Warp, for instance, we have MCP servers set up for accessing our task-tracking system, or accessing Slack or Notion or our crash-reporting system.
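As an aside, MCP clients are commonly configured with a small JSON block that names each server and says how to launch it. The exact format varies by client, and the server names and packages below are invented placeholders, not Warp's actual configuration:

```json
{
  "mcpServers": {
    "issue-tracker": {
      "command": "npx",
      "args": ["-y", "example-issue-tracker-mcp"]
    },
    "crash-reports": {
      "command": "npx",
      "args": ["-y", "example-crash-reports-mcp"]
    }
  }
}
```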
00:15:40
Speaker
So the LLM can decide to invoke that tool. It also has a tool for generating a coding change.
00:15:50
Speaker
This takes a file and instructions and returns a diff, and then we have logic internally for applying those diffs. So think of it as this huge variety of tools that the LLM has access to, and depending on your prompt, it can invoke those tools.
00:16:08
Speaker
And so let's say it wants to invoke the run-command tool, and it wants to run a git status command. What it does is pass that information from the server back to Warp's client, and the client will make that tool call. In this case, it'll look at the output and pass that back.
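Here's a minimal sketch of that server-to-client tool loop using the Anthropic Python SDK, with a single run_command tool executed locally. The model name, tool schema, and prompt are placeholder choices, not Warp's internals:

```python
import subprocess
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

TOOLS = [{
    "name": "run_command",
    "description": "Run a shell command in the user's terminal and return its output.",
    "input_schema": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}]

messages = [{"role": "user", "content": "Why is my working tree dirty?"}]

while True:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",   # placeholder model name
        max_tokens=1024,
        tools=TOOLS,
        messages=messages,
    )
    # Append the assistant turn so the next request carries the full history.
    messages.append({"role": "assistant", "content": response.content})
    if response.stop_reason != "tool_use":
        break  # the model produced a final answer instead of a tool call
    # Execute each requested tool call locally (the "client side" of the split).
    results = []
    for block in response.content:
        if block.type == "tool_use" and block.name == "run_command":
            out = subprocess.run(block.input["command"], shell=True,
                                 capture_output=True, text=True)
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": out.stdout + out.stderr,
            })
    messages.append({"role": "user", "content": results})
```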
00:16:29
Speaker
And so that's the general architecture. Our server is talking to the foundation-model LLMs, typically. And we use LLMs from all of the providers.
00:16:41
Speaker
Currently, we think that the best general-purpose model for development tasks is Claude, either Sonnet or Opus. But we also use models from OpenAI, and we use models from Google.
00:16:54
Speaker
It's up to the user to some extent. So if the user wants to say, hey, I want to make my default model Gemini, for instance, they can do that. And so the...
00:17:05
Speaker
That's the underlying driver of a lot of this stuff, but there's a lot of logic in code on our server for coordinating how we get all the right context, how we prompt the model, and then how we execute the tool calls that come back from the model. Does that make sense?
00:17:21
Speaker
Yeah. I'm wondering how some of that works... I'm surprised you're not doing any training on top of those models, and I'm wondering how your rules at the back end work. Yeah.
00:17:32
Speaker
So the short answer is: we are doing some stuff like that, where we fine-tune and train and try to use data to improve the experience. But the foundation models themselves, we're not trying, at least currently, to build something competitive with them. We are relying on
00:18:00
Speaker
a competitive ecosystem of these foundation models. It'll be interesting to see what happens with Llama and Grok and Gemini and Anthropic.
00:18:11
Speaker
We think that we're just too small; it's too capital-intensive for us to build a model from scratch. But we are doing things where we try to fine-tune or build classifiers for certain types of features in the app.
00:18:27
Speaker
And how do you assemble the right prompts, and how do you parse the right output? Because I'm assuming that if you go to Claude, they don't know anything about your Warp.
00:18:41
Speaker
This is how you invoke this model. Correct. So how do you make that work? So I guess one important thing to point out is that these foundation models are all stateless APIs, right? They take tokens in and they give you tokens out. And that means that every request you make to them needs to include all of the relevant context.
00:19:05
Speaker
And so the way that we architect this is that there is a system prompt that defines all of the tools that are available.
00:19:16
Speaker
And then as the user continues the conversation, all of the context that is in the conversation is incrementally passed up to the LLM every time. So you could think of it as: okay, we start with a system prompt, then there's a user query, then there's a response from the LLM, then that response might invoke a tool, which would produce output, which would then get sent back to the LLM.
00:19:45
Speaker
And it could continue going like that, with successive tool calls, until maybe the LLM decides that it needs more user input. And at that point, the user would type something or speak something.
00:19:56
Speaker
Then that whole thing goes back to the LLM. And the trickiness with this is, if you think about it from a token-usage perspective, it grows, I think, N squared. And so you have to do things to limit the usage. And one of the main techniques there is something called prompt caching, where if you keep the prefix the same, it's a lot less expensive to call the models. So there's built-in stuff at the model layer to help with this.
00:20:29
Speaker
And then we have to- Hang on, slow down. You've got to explain that a bit more for me. I didn't quite get that. Okay, so let's say you have a conversation going. So it's a system prompt, plus user message one, plus the LLM response, plus user message two, plus the LLM response.
00:20:48
Speaker
You can think of this as appending messages to a prefix: your existing conversation so far.
00:21:03
Speaker
So if I say hello and the LLM says hi, and then I say, how's it going? and the LLM says, it's going fine: there are optimizations on the side of the LLM that make it so that it can kind of remember what the conversation is so far.
00:21:21
Speaker
And so this is a big cost and latency optimization that all of these LLM providers support: you don't have to keep re-sending all of the conversation so far, so long as nothing in it has changed.
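For example, with Anthropic's API you can mark the stable prefix explicitly with a cache_control breakpoint; some providers instead cache shared prefixes automatically. A sketch, reusing the client from the earlier example:

```python
# Mark the system prompt (and tool definitions) as a cacheable prefix, so
# repeated calls that share it are cheaper and faster.
response = client.messages.create(
    model="claude-sonnet-4-20250514",           # placeholder model name
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": SYSTEM_PROMPT_AND_TOOL_DEFS,     # the stable prefix (hypothetical)
        "cache_control": {"type": "ephemeral"},  # cache everything up to here
    }],
    messages=conversation_so_far,  # grows only by appending at the end
)
```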
00:21:37
Speaker
Does that make sense? We call that prompt caching. Is it somehow that you're associating a prefix with a certain activation state in the neural network?
00:21:49
Speaker
Yes. I actually don't know how it's implemented on their side. I can only speak to it as an API. But sooner or later, it's a black box to everyone.
00:22:03
Speaker
At some point, it's a black box to everyone. And so that's a thing that is tricky: managing this prior conversation context. Another thing that's interesting technically, as long as we're talking about this, is that there's a limited context window associated with all of these
00:22:24
Speaker
models, which means you can only pass a certain number of input tokens that it can have attention over. For Anthropic's models, I think this is currently 200,000 input tokens; for Gemini, it's a million.
00:22:44
Speaker
And sometimes conversations just have more tokens than that. And what happens if you take a very naive approach is that the LLM starts to forget what's been talked about so far.
00:22:59
Speaker
Yeah, I've seen that happen. Which is a very annoying thing as a user if you hit it. And so you have a choice as someone building on LLMs.
00:23:10
Speaker
Do you give the user control, like: hey, I want to clear the context window, or I want to summarize stuff that's in it? Or do you try to do it for the user? And Warp's perspective, for the most part, is that the user shouldn't have to think about this. And so we will do things when you get close to the limits of the context window, like
00:23:37
Speaker
summarizing and compacting the conversation so far. We will try to truncate what we think are irrelevant sections of long inputs.
00:23:49
Speaker
And you can get very long inputs just because, let's say you, as a user, are running a server and it's spewing out logs.
00:24:00
Speaker
It's very easy in Warp to attach all of that as context, but it might not all be relevant, and it might confuse the LLM. So there's a lot of very, very interesting engineering that goes into making the experience
00:24:14
Speaker
work really well for users. And a lot of it is around managing how you do the prompting, managing how you do the context, and figuring out which models to call in which situations. Because there's another variable, which is that these models all have different performance characteristics in terms of latency and quality of responses. So it's a really interesting engineering challenge to make the system work well.
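A toy sketch of that kind of context management, with invented thresholds and a deliberately crude token estimate; this is the shape of the idea, not Warp's logic:

```python
MAX_TOKENS = 200_000   # e.g. the Anthropic-class window mentioned above
HEADROOM = 20_000      # leave room for the model's next response

def estimate_tokens(messages):
    # Rough heuristic: ~4 characters per token. Real clients use a tokenizer.
    return sum(len(str(m["content"])) for m in messages) // 4

def compact(messages, summarize):
    """When near the limit, replace old turns with an LLM-written summary."""
    if estimate_tokens(messages) < MAX_TOKENS - HEADROOM:
        return messages
    head, tail = messages[:-10], messages[-10:]   # keep recent turns verbatim
    summary = summarize(head)                     # one extra LLM call
    return [{"role": "user",
             "content": f"Summary of the earlier conversation:\n{summary}"}] + tail
```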
00:24:37
Speaker
And another big change in our engineering mindset has been that the only way to do this effectively is by constantly measuring and evaling, and making sure that any change we make doesn't regress things. You have to look at data and evals, and look at user acceptance rates and things like that.
00:25:01
Speaker
What are you measuring?
00:25:05
Speaker
So there's a whole set of public evals, which are a really good starting point, and which I mentioned earlier: TerminalBench and SWE-bench. They provide some sort of task description.
00:25:25
Speaker
A typical SWE-bench task might be: hey, there's a bug in this random open-source Python repo. Here's a description of the bug, and here's a test that verifies whether or not it's fixed.
00:25:41
Speaker
Can your agent fix it? Or for terminal stuff, it might be like, um, here's like some messed up get state. Well, can your, here's what it should look like if it's verified, can your agent fix it?
00:25:55
Speaker
And it's not pure regression testing, because some of these things are beyond the ability of the best agents right now, period. On TerminalBench, for instance, we're number one, but we're only solving 52% of the problems in it.
00:26:12
Speaker
On SWE-bench, we have something like a 71% success rate, and the top agent is at like 73% or something. So you want to have things in this eval set that are too hard to do currently, so that you can measure progress.
00:26:29
Speaker
So there are the public benchmarks. And then there's another set of things that we do, which is: when we have bad experiences, or our users have bad or good experiences, we have an easy way of taking a real conversation, anonymizing it, and turning it into
00:26:46
Speaker
a regression-style eval in our own eval set. And so measuring, as we make changes, whether we're moving that number up or down is really important.
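In the spirit of what's described, a regression-style eval can be as simple as a task prompt plus a scripted check, scored by pass rate. Everything here (the case, the fixture path, the run_agent entry point) is hypothetical:

```python
import subprocess
from dataclasses import dataclass

@dataclass
class EvalCase:
    name: str
    prompt: str      # what the agent is asked to do
    check_cmd: str   # shell command that exits 0 iff the task succeeded

CASES = [
    EvalCase("fix-detached-head",
             "My repo is in a detached HEAD state; put me back on main.",
             "git -C /tmp/fixture symbolic-ref -q HEAD"),
]

def run_suite(run_agent, cases):
    passed = 0
    for case in cases:
        run_agent(case.prompt)   # the agent mutates the sandboxed fixture
        ok = subprocess.run(case.check_cmd, shell=True).returncode == 0
        passed += ok
        print(f"{case.name}: {'PASS' if ok else 'FAIL'}")
    return passed / len(cases)   # track this number as the code changes
```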

Evaluating and Assessing LLM Tools

00:26:58
Speaker
So that's one approach. The other approach is more telemetry-based, where you look at, for example, a really simple thing: we have a thumbs up, thumbs down on our agent interactions in production. And so you can look at the rate of that.
00:27:16
Speaker
Or for coding changes, you can look at the rate of diffs that users accept. And so looking at real user data to figure out, are we improving or regressing?
00:27:27
Speaker
Ideally you're doing that in a way that's A/B tested, so you try to limit variables and do it in a statistically significant way. But that's a big mindset change, because for most of the time building Warp, we were building more deterministic features.
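For the statistics behind that, a two-proportion z-test is one standard way to decide whether a change in diff-acceptance rate between control and treatment is real. A sketch with invented numbers (scipy assumed):

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_z(accepted_a, shown_a, accepted_b, shown_b):
    p_a, p_b = accepted_a / shown_a, accepted_b / shown_b
    pooled = (accepted_a + accepted_b) / (shown_a + shown_b)
    se = sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    z = (p_b - p_a) / se
    return z, 2 * (1 - norm.cdf(abs(z)))   # two-sided p-value

z, p = two_proportion_z(9_600, 10_000, 9_720, 10_000)
print(f"z={z:.2f}, p={p:.4f}")  # ship the treatment only if the lift is real
```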

Economic and Emotional Impact of LLMs

00:27:49
Speaker
Let's put it that way. Whereas when you're working with the models, it's really not easy to tell if something is getting better or worse as you're making these changes, because there's this sort of stochastic nature to them.
00:28:01
Speaker
Yeah, it's not until you push out into the large and see how the crowd responds to a bunch of changes. Something like that, yeah. That must be really hard. Also, your computing budget must have gone through the roof.
00:28:12
Speaker
So we are spending a lot of money right now. And to a certain extent, that's a good thing. It means people are engaging with the agent features in Warp really heavily, which is very, very cool to see. But it's not like a SaaS business, where I think you can have like a 90% gross margin or something like that.
00:28:38
Speaker
It's an expensive proposition to run all this. Every month we send a big check to Anthropic and OpenAI and Google. And then I assume they send a big check to Google Cloud or whoever's running their infrastructure, who then writes out a big check to NVIDIA. And so it all funnels down.
00:29:03
Speaker
It all kind of funnels down. And so it's a really interesting business problem: where's the value in this whole architecture? Is it the app, the model, the data center, the chips?
00:29:19
Speaker
The power companies? I don't know. I don't think it's the power companies. But it's a very, very expensive proposition to produce and consume these intelligent tokens.
00:29:31
Speaker
But it's amazing what it can do. If you're using Claude Code, I think you're getting some of that experience where it's like, wow, I'm just telling the computer to do this thing and it's kind of doing it for me. And that's pretty exhilarating.
00:29:44
Speaker
Yeah, I found it extremely depressing for a week. Oh, really? Because I hit this point with Claude. I finally tried it out when someone recommended it to me. And I realized that, okay, this isn't as good as I can write by hand, but it's certainly as good as I would accept as a pull request.
00:30:02
Speaker
Interesting. And I spent the next week after that thinking, this is the end of programming. My career is over. What's happening? And then a week after that, I was like, actually, no, this is really exciting. This is a new frontier. Things are going to change. We get to build more stuff. I mean, I'm curious what you think about this, but my take on that particular point is that, at least today...
00:30:32
Speaker
you need to know what you're doing if you want to use these tools on production code bases and produce high enough quality code to submit.
00:30:45
Speaker
It requires a bunch of engineering knowledge, and the actual
00:30:55
Speaker
thinking of how to use these things is kind of fun for me. And the success rate I find is much higher right now if you don't just tell them the outcome that you want, but tell them how you want it built.
00:31:12
Speaker
And so you get to be an engineer, in the sense of: hey, this is what I want the model to look like. This is what I want the API to look like. This is how you should pipe through the data, and this is what the UI layers should look like.
00:31:27
Speaker
But you're just not typing as much. Yeah. And you can do multiple of these things at once. It does make me a little...
00:31:37
Speaker
dizzy to do all the multi-threading with it, to have multiple agents going. But that's how I'm working now: if I'm really going to spend a day programming, I'm not going to sit there and watch one agent do stuff for me. I'm going to run two or three, and I'm going to think hard about how I ask them to do the task. It's just a new kind of puzzle-solving for me. So I still like it, but I don't know where it goes a year from now; I couldn't tell you. Right now, I think it's still engineering. It's still pretty fun.
00:32:07
Speaker
Yeah, it absolutely is. And it's a bit more like managing a team of junior developers. Yes. Rather than having your hand on every line of code. Correct.
00:32:17
Speaker
And just like with junior developers, you don't want to give them total free rein. You really want to guide them and be like, hey, here's what a good design for this would look like.
00:32:28
Speaker
Check in with me. Let me see it as you do it, incrementally. Don't give me some huge mess of code that you wrote to review. You want to teach it to understand the system. And so it's kind of like everyone becomes a tech lead in some way. Yeah, yeah. Now, I think it's hard for people
00:32:50
Speaker
who are early in their career. I worry that they won't develop the skills that you need to be an effective tech lead. I think that's a challenge.
00:33:04
Speaker
And paradoxically, I think if you, as a senior engineer, know how to architect these things, you should be the real power user of this technology. But I actually find that many senior engineers are reluctant adopters, because
00:33:20
Speaker
they're like, the code it produces is not very good, and they can do it faster themselves. So I'm seeing way more enthusiastic adoption, period, by people who are non-developers than I am by people who are pro developers.
00:33:36
Speaker
But I think that has to change and will change. Yeah, I would expect non-developers to be delighted by the power they've suddenly got in their hands. Absolutely, it's like a magic wand. It's like, I couldn't do this thing, and now I can...
00:33:52
Speaker
build anything. And if you don't care how it's built, then I think: great. But the problem is, with professional software engineers, we do care how it's built.
00:34:03
Speaker
And it matters how it's built. Yeah, exactly. But I've had very good success with saying, okay, go and read the spec document and write the core data types for this project.
00:34:14
Speaker
And then it gives me them, and I correct them and get them exactly how I want them. Like old-school programming: you get the core types exactly as you want them, and everything else just flows a bit more naturally.
00:34:26
Speaker
And you can do that. A hundred percent. Yeah. And that's you being an experienced engineer; you're doing it the right way, in my opinion, if that's how you're doing it. So you tell it: let's go one step at a time,
00:34:39
Speaker
build me this core component, let's write tests for it, let's make a commit for it, let's get that part right, and then we'll go on to the next thing. Whereas what I see a lot of non-developers do is say: build me the whole app, or just build me this feature. And it will do it, and that'll work for certain applications; it just won't work for large, production-scale applications. I think it will work fine for: build me an app that tracks when I last fed my baby, or something like that. You know, personal software. You had to pick an example that was life or death, didn't you?
00:35:20
Speaker
Yeah, that's funny. Maybe that's a bad use case. Maybe if you go for that, you'd be like, ah well, the app says I fed it an hour ago.
00:35:30
Speaker
You know, whatever, something like that. I'd like to remind listeners that this is not legal advice. Yeah. Yeah, don't build that app. But it's exactly the same as if you hired junior developers and said, go and build the entire thing from scratch with no guidance. If they were enthusiastic enough, they'd get it done in some form, but they need that guidance.
00:35:49
Speaker
Yeah, it kind of reminds me of times in my life when I've hired contract developers who don't know the code base, and you give them a task, and they go off, and you check in with them the next day, and they show you the thing, and it looks like they built it. And it's like, whoa, they built it. And then you look at the code, and you're like, ah, we're never going to be able to merge this. And so yeah, that's the state of the art today. I think it'd be really interesting to see
00:36:26
Speaker
if the models can get good enough, and I would bet that they can, so that not only are they getting the outcome right, but they are building in a way that a professional developer would actually build within a real code base.
00:36:44
Speaker
To me, that's the chasm that has not been crossed. But given the rate of improvement, I think that's the problem they're probably focusing on at these labs right now: don't just produce outcomes, produce good code.
00:37:01
Speaker
That's what I want. I think we will probably see that. I would be surprised if we didn't. I mean, it's inevitable it will get better from here. But even if all development froze right now, it would still be changing the world of programming. Absolutely. I think it's the biggest
00:37:19
Speaker
change in software development that I've seen in my life. I don't know what the right analogy is, but it's kind of like going up a level of abstraction. For people before me, you coded in assembler and then you coded in C, and whoa, that's a big difference.
00:37:37
Speaker
And then even going from C to coding in TypeScript, that's also a big difference. But I think this is an even bigger deal, because you're just not having to write code so much anymore, which is wild.
00:37:52
Speaker
It really is. It made me think recently of the people wondering how junior developers are going to learn things. I worry about that too, but at the same time, I can imagine people at the release of the calculator saying, how are kids going to learn to use the slide rule?
00:38:08
Speaker
Maybe it won't matter so much. Right, right. And so I think the optimistic case here is that this creates a world where everyone can build, and what matters is what you're trying to build. Is it solving a problem for someone? It kind of lowers the barrier for people to build amazing stuff, which I think is pretty cool. I think in a macro sense, that's going to be really good for
00:38:45
Speaker
humanity, for the economy, for the ability of everyone to just not have this barrier when it comes to building software. Yeah. I also think, if I may, it's going to raise the bar for what professional programmers can do as well.
00:39:01
Speaker
Correct. Because I can get someone else to write really good documentation for me and just check it over. I can spin up a website to promote the thing I'm building more easily. Exactly. It sort of accelerates everything. If you're a pro who really likes writing code and solving hard problems, I think there's an optimistic case for this as well, which is:
00:39:21
Speaker
yeah, you get to focus on the cooler stuff, and you get to multiply yourself and get more done. So that's the bull case, the optimistic case. There's definitely a pessimistic case here, which is: if I've spent my life honing these skills around writing code, and all of a sudden writing code is kind of automated,
00:39:46
Speaker
then what do I do? I don't think we're there yet. And I've never thought that the essence of engineering is writing code; I think it's problem-solving. So I'm more on the optimistic side here. But yeah, there's definitely real risk on the pessimistic side.

AI's Transformational Effect on Software Development

00:40:09
Speaker
I'm generally erring on the side of optimism. I think it's going to take us a while to readjust, but we're going to end up building more cool stuff. And that's what I'm here for. Yeah. But okay, we're getting too much into speculation for my liking. I want to bring it back to hard tech.
00:40:23
Speaker
One of the things that gives these things much better results today is the amount of context, and relevant context, they've been given. So you said you've built a semantic layer for indexing code bases.
00:40:39
Speaker
Correct. Sounds like that would be a perfect thing for giving good context, but what is it and how does it work? Yeah, so at a super high level, what it does is take code files and kind of vectorize them. Yeah.
00:40:57
Speaker
So they go into a vector database. I think we use something called turbopuffer. I didn't build this, but there are a bunch of different options; Pinecone is another one in this space. And when you vectorize the code base, think of it as putting all the code into some really high-dimensional space.
00:41:19
Speaker
And then when someone asks a question about your code, say it's: I want to know where in the code this tooltip is defined, that query can also be vectorized.
00:41:34
Speaker
And you then try to find the files that are the closest distance to that search vector; I think that's basically how this works. There are people who know this stuff way better than me, so just a caveat, but that's the general principle.
00:41:52
Speaker
And so from that vector search we will then return basically the names of the relevant files, but just the names. We don't store any code on the server, just to be clear. So we're not sucking up and storing someone's code base. We just have this mapping of vectors to names of files, essentially. And then we take the names of those files and
00:42:20
Speaker
we basically take the contents of the file, and maybe not the whole file, maybe just the relevant section of the file; we read that from disk and send it in as part of the context to the LLM.
00:42:32
Speaker
Does that make sense? Yeah, yeah, okay. So you can find files that are similar to your query in a search space. Exactly. So the steps are: you take the query, turn it into a vector, and find the files that are closest to that vector. And I don't know the details of how we do it, whether we break up the symbols within the files; there's probably- I was going to ask if you chunk it up at all.
00:42:57
Speaker
I don't know. I didn't build it; I wish I knew. But at a really high level, we're finding the files, or the sections of files, that are most similar in the vector space. We're passing the file names back, essentially. Then we're reading the current state of the file, and then that goes to the LLM.
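A minimal sketch of that pipeline: embed files, embed the query, rank by cosine similarity, and return file names only, with the client reading fresh contents from disk. The embed function stands in for whatever embedding model is actually used:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

class CodeIndex:
    def __init__(self, embed):
        self.embed = embed    # text -> np.ndarray; a stand-in embedding model
        self.entries = []     # (path, vector); no code stored, just vectors

    def add_file(self, path, text):
        self.entries.append((path, self.embed(text)))

    def search(self, query, k=5):
        qv = self.embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(qv, e[1]), reverse=True)
        return [path for path, _ in ranked[:k]]   # names only

# Usage: paths = index.search("where is this tooltip defined?")
#        context = [open(p).read() for p in paths]   # read fresh from disk
```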
00:43:18
Speaker
So there are things about this that are hard. One of the harder things is that there's a freshness problem, because the contents of files change, and that's happening all the time, right? If you're writing code, you need to recompute the vector, because it's very frustrating for someone to
00:43:41
Speaker
have a PR that they're working on, where they change a bunch of stuff, and then they ask the LLM, hey, for this new function that I just added, can you do X? And then the LLM can't find it. That's annoying.
00:43:55
Speaker
So that's one of the challenges. The other challenge, or not a challenge, but: you generally don't want to use code-base embeddings as your only search mechanism.
00:44:06
Speaker
So if a user is searching for specific symbols or the like, you can often just do better with grep and string matching, and it scales better across bigger code bases. So Warp uses a combination of these things to try to find the right spot someone is talking about in their code base. You've made me think that inevitably, at some point in our future, someone's going to build into Git, or a version of Git, something that also does vector embeddings.
00:44:42
Speaker
Probably. So that when I do git grep, I can get semantic grep as well as text grep. Yeah, I mean, for all I know, that exists. But yeah, probably something like that should exist, yeah. Okay, that makes sense.
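Combining the two mechanisms might look something like this sketch: a high-precision lexical pass (ripgrep here) merged with the semantic index from above. The merge strategy is invented for illustration:

```python
import subprocess

def hybrid_search(query, index, repo_dir, k=5):
    # Lexical pass: exact symbol and string hits are cheap and precise.
    rg = subprocess.run(["rg", "--files-with-matches", query, repo_dir],
                        capture_output=True, text=True)
    lexical = rg.stdout.splitlines()
    # Semantic pass: covers paraphrased queries with no exact match.
    semantic = index.search(query, k=k)
    # Naive merge: lexical hits first, semantic fills the remainder.
    seen, merged = set(), []
    for path in lexical + semantic:
        if path not in seen:
            seen.add(path)
            merged.append(path)
    return merged[:k]
```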
00:44:55
Speaker
So I wanted to talk about the editor that you put into Warp, right?

Warp's Code Editor and Review Process

00:45:03
Speaker
Because this is another part of building out a change that's close enough, and then the human gets it all the way over the line, right? That's exactly our view. That's the right way to think of it, in my opinion.
00:45:14
Speaker
How far are you planning to go with an editor built into the terminal? Because it sounds like maybe you're trying to rearrange the pieces of an IDE with this. That's right. I mean, we're literally calling Warp an ADE, as a kind of homage. So the way that I approach this is: what are the main workflows that require editing in this new world?
00:45:40
Speaker
And so we're not really building for a workflow of: I'm a developer, I want to open up five files, I want to type in a bunch of code, I want to compile it.
00:45:50
Speaker
If that's how you're working, you should work in the code editor or the IDE. We're building editing capabilities for an agent-first workflow, where you start off by saying: hey, I want an agent to make this change.
00:46:05
Speaker
And then you want to be able to step in and review and correct and guide the agent. So think of things like editable inline diffs, meaning the agent says: hey, here's a change that I want to make as part of this.
00:46:23
Speaker
I'm going to show it to you inline, meaning in the middle of your conversation. Does this look good? If not, do you want me to change it? And you can reprompt to change it. Or I could just be like, okay, you're not doing it right. Let me just...
00:46:39
Speaker
click in and edit it to what I want it to be. Then you take my edits as context for the next thing that you do. So that's one modality. The other really big modality that we're building for, which I think is super cool, is basically a code review UI for humans to review agent-generated code
00:47:00
Speaker
in the actual app that you're working in. Right now, I think this is actually one of the bigger shortcomings of Claude Code or Gemini CLI or whatever: it makes changes, and then you have to either do git diff, or go into your editor, or go to GitHub, or go to Git Tower, or whatever code review UI you use.
00:47:25
Speaker
And so we are building a really, really nice UX so that you can always see and adjust all of the changes that the agent is making.
00:47:36
Speaker
And so those are the two main use cases: adjusting inline, and then code review. And so the question is, how much editing do you need for that? I think you need the basics.
00:47:47
Speaker
So you need syntax highlighting, you need linting, you need line numbers, and I think you probably need LSP integration for it to feel good, meaning red squigglies when there's an error, hover to see the symbols, and jump to references.
00:48:06
Speaker
Do you know LSP, by the way? The Language Server Protocol. Yeah, the standard way your language tooling can give you feedback on the code. Yeah, so you need that. Here are things that I think you don't need. I think you probably don't need a minimap.
00:48:25
Speaker
I don't think you need the whole VS Code extension ecosystem. Maybe some things you need. You just don't need all of the knobs that you get in a first-class code editor.
00:48:41
Speaker
You probably need some of them, though. I think you need vim keybindings. I was desperately saying that in case it annoyed you. But yeah, I do actually need vim keybindings. You need vim keybindings.
00:48:53
Speaker
There's kind of an 80-20 approach, where if I go into an editor and I'm like, oh my God, I can't jump to a reference, I'm like, screw this, I'm going to go to the IDE. And so we want to build enough of it so that you just don't feel the need to context-switch out of Warp to do these main workflows.
00:49:12
Speaker
But there's no world in which we're replicating VS Code, because I think that's mostly not needed. Most of VS Code is geared around having tons of tabs of files open and file trees and that kind of stuff. And I don't think that's needed. It's needed for some things right now, but it won't be needed in the long run. So we're not really investing in it.
00:49:39
Speaker
How much- I think it even gets in the way. I think it's distracting. Potentially. I've certainly started to feel that when you're dealing with agent-written code, it is more like reviewing a pull request on GitHub than it is co-editing.
00:49:54
Speaker
Yeah, that's how I feel. How much will you develop something like Git integration? Yeah, so I'll literally just tell you what we're thinking. We'll definitely have the ability to see your current diff against your staged and unstaged changes. And we'll also have the ability to work at the PR level, which is typically how I like to work when I'm working with agents: what's the whole diff for the PR? You'll be able to see the current commit, you'll be able to see the PR, and then you'll have stuff where you'll be able to easily open a pull request.
00:50:41
Speaker
It's interesting: we don't have to build this in any fancy way, because it's so easy to just ask the agent. It can literally be a prompt that says: can you use the GitHub CLI to open a pull request? And that's our open-a-pull-request button.
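For instance, such a canned prompt could expand to little more than the agent running the GitHub CLI; the title and body here are invented placeholders:

```sh
gh pr create --base main \
   --title "Add dropdown menu" \
   --body "Change generated with an agent and reviewed by a human."
```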
00:50:54
Speaker
And so as much as we can, we'll do things through the LLM. An interesting question is that we'll probably make it so that you just don't have to go to GitHub at all.
00:51:07
Speaker
I'm curious if people will want that, meaning we'll probably build it in so that you can collect your teammates' comments as well. I don't know. That's not as high a priority for us at the moment as just having a really, really nice way for a human to review the agent's code. That's the top priority.
00:51:29
Speaker
That's fair, yeah. There's one specific feature I wonder about in this whole pull-request workflow, which is: you want to be able to see the diff, and you want to be able to put your hands on it and make changes.
00:51:41
Speaker
But I also find you want to be able to make inline comments and say, this bit's fine, but I want you to change this from a struct to an enum. Yep. So, I mean, this is exactly what we're working on. Code review with an agent is not just about hand-editing and correcting. It's about reprompting: exactly like you said, hey, this part doesn't look right, can you change it? So figuring out the best possible user experience for doing that is something that Warp is going to be really, really good at. Because we don't have the baggage of
00:52:20
Speaker
being in VS Code, which isn't really built for that. And we don't have the limits of being a CLI app. So we're just building what we think is the best UI for doing exactly what you said, which is: how do you tell the agent, hey, this is partially right, but not totally right?
00:52:38
Speaker
Change this part of the diff, but leave that part. So that's the key workflow we're trying to really get right. How much are you using it internally? How much has it changed your development cycle in-house?
00:52:51
Speaker
I personally only code by prompt now. And this is a recent development; I would say up until a couple of months ago it wasn't, because Warp is over a million lines of Rust code on the client.
00:53:05
Speaker
And we have a custom UI framework, as I discussed with you when I was on last year; it's a really complicated app that LLMs don't know anything about. And so it's only within the last couple of months that Warp itself has gotten good enough that we can build Warp within Warp.
00:53:22
Speaker
And now, on the team, every single change at least starts with a prompt. There's a sort of mandate around that; I've asked people to do that.
00:53:38
Speaker
Not every task is finished by prompting, definitely. There are a lot of cases where it's just faster still for someone who knows what they're doing to go into a code editor, and probably not even in Warp yet, because our code review stuff isn't quite good enough yet, though it will be soon,
00:53:59
Speaker
where they have to go and do things by hand. But every single programming task that we're doing right now to build Warp is starting with a prompt in Warp. And the completion rate, I don't know what it is, but it's getting higher week over week.
00:54:13
Speaker
And it's really incredible: there are certain tasks where you just do it 10 times faster, especially if you're in a part of the code that you don't know well, or
00:54:26
Speaker
you're building something zero to one. And so we're now mostly working in this new style; not entirely, but mostly. That sort of matches my personal experience of how the way I'm writing software is changing.
00:54:42
Speaker
Yeah. So what's the success rate? I mean, we've got to talk about whether you've noticed bug reports increasing or decreasing.
00:54:53
Speaker
Yeah, so in production, when Warp suggests diffs, and this is close to 100 million lines of code a week currently and growing very fast, it's like a 96% to 97% acceptance rate from the user on the diffs that we suggest.
00:55:14
Speaker
That doesn't mean it's 97% right, just to be very, very clear. I actually don't know; it's kind of a harder thing to measure. I think the best proxy that we have is the rate at which we need to change the diffs.
00:55:33
Speaker
And I think internally it's definitely not 97%. So there's probably a bunch of, for lack of a better word, vibe coding that people are using Warp for, where they're just like: yep, good, good. This looks good. I'll accept it. I'll accept it.
00:55:47
Speaker
Internally, it's certainly lower than that, but still pretty high in terms of the diff acceptance rate. But I don't actually know the number off the top of my head. Okay, but I have to ask to get the full picture, right?
00:56:00
Speaker
Here you are, the CEO or the CTO, I forget which. CEO. CEO of the company. Your code base is the goose that lays the golden egg, right?
00:56:14
Speaker
Are you worried that you're going to wake up six months from now with a code base that's been vibe coded out of maintainability? No, because part of the internal rules around using Warp to code, or using any AI tool to code, is that you are responsible for code quality at the same level as if you had written it yourself.
00:56:39
Speaker
And I think this is an important rule for teams that are adopting agentic coding: it's never okay for someone to say, well, the AI wrote it, it seems right.
00:56:52
Speaker
You have to understand the code that's produced at the same level as though you wrote it yourself. And the code review standards, and I still think code review is probably our best quality control mechanism, have not changed at all.
00:57:07
Speaker
And also we have a bunch of very, very senior, experienced engineers on the team who are not about to let vibe coded stuff get in. So that's why I think no, I think we will be fine in that respect. I don't think that the quality of code is going down at all.
00:57:31
Speaker
We would kind of go faster short term if we just accepted the vibe coded stuff. But I think, to your point, we would go way slower long term. So we're just not allowing that.
00:57:42
Speaker
Right. Yeah. Do you still have, I mean, we're talking about it becoming more like we're code reviewing a pull request from an agent. Do you still then have a second human being on the team review the actual final pull request? 100%, yes.
00:57:57
Speaker
So the workflow is: let's say I'm building a feature. I'm going to start that feature by asking an agent to do it. Then I'm going to code review what the agent did. Then when I think that looks good, I'm going to send it to...
00:58:10
Speaker
you know, David on our team to code review that as though I wrote it. And if David doesn't like it, he's going to say, this is not good, make changes. So it's not like we've gotten rid of a level of peer review for humans at all.
00:58:28
Speaker
In fact, I think it's more important in a world where you're delegating, where you're not writing as much code by hand. You need that extra set of eyes to make sure the code is good. Yeah, yeah. I think in this new era, if it does dominate, we're going to have to get much better at code review, and generally better at typing, because we're going to be typing more prose than ever before.
00:58:48
Speaker
Yeah, although I generally speak

Innovations in User Interaction and Debugging

00:58:52
Speaker
now. Oh, do you? Yeah. So Warp has built-in voice integration where I just hold down the function key and talk to Warp and tell it what I want to do, because I don't like typing long English prompts. That's not fun for me.
00:59:07
Speaker
Fair enough, fair enough. It's possibly a little too Star Trek for me at this stage. It's crazy. Yeah. Okay, so we've talked a lot about developing. There is still the other huge use case of terminals, which is kind of diagnostic and debugging stuff, where you go into a remote server and try and figure out what went wrong.
00:59:27
Speaker
Are you doing anything special for that? So I think just by virtue of Warp's DNA as a terminal, it's an incredible product for this. And this is still...
00:59:39
Speaker
how a lot of Warp users have their first and best experience with AI. They're like, okay, I want to figure out why this server is in a crash loop and not restarting.
00:59:53
Speaker
And their choices are to go to the AWS CLI documentation page, or to just ask Warp to do it. And having to learn the documentation on how to use these CLI tools is just hell compared to guiding them with English.
01:00:15
Speaker
And so Warp is really, really good for this. And it is a really big use case. I think it's an overlooked use case too, in the sense that most of the hype is around code generation, but a lot of what developers do is around managing systems, diagnosing problems, setting up infrastructure, working with Kubernetes and Docker, depending on the type of engineering you do.
01:00:43
Speaker
And Warp is awesome for that. We want to make sure that we are the best at that. So, yeah, that's a common use case. Does that feed into things like your semantic search? As you said yourself, you can't really vectorize all the logs you might need to go through to diagnose a problem.
01:01:05
Speaker
Yeah, I mean, you just use a different set of tools. Again, the terminal is a great workbench for this already. Warp knows how to use grep and find and sed and awk and all of these things that, you know, I never really learned, but Warp can do. And these are super power tools whenever you're working with a lot of textual data. So again, that's really a great place to have this extra level of abstraction. I don't know anyone right now who wants to learn awk, but it's powerful.
01:01:39
Speaker
And if you're like, I want to find a needle in a haystack here, or I want to do some analysis on the frequency of these types of errors, or I want to use some network debugging tool that I don't really know, I could spend a few hours reading the manual, or I could just ask Warp to do it. It's an easy choice.
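As a concrete illustration of the kind of one-off analysis being described, counting how often each error type shows up in a log, here is a minimal Python sketch. It is illustrative only: the log path and line format are invented, and in practice this is exactly the sort of thing you would ask the agent to generate as a grep or awk pipeline instead of writing it yourself.

```python
# Count how often each error type appears in a log file.
# Hypothetical path and line format, e.g.:
#   2025-06-01 12:00:03 ERROR TimeoutError upstream did not respond
from collections import Counter

counts = Counter()
with open("/var/log/app/server.log") as log:
    for line in log:
        if "ERROR" in line:
            rest = line.split("ERROR", 1)[1].split()
            if rest:  # assume the token right after ERROR names the error type
                counts[rest[0]] += 1

for error_type, n in counts.most_common(10):
    print(f"{n:6d}  {error_type}")
```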
01:02:00
Speaker
Yeah. One place I've started using that is where I used to know it, and I don't want to spend the next half hour remembering what I used to know about awk in order to get this simple task done.
01:02:10
Speaker
Totally. Yeah. I have very vivid memories of being on call at Google, and there's a huge stream of logs and I'm looking for something and I'm like,
01:02:22
Speaker
fuck, how do I do that? Or sed, like, what's the control character that needs to go here? Just the amount of time I would waste doing that, only to literally forget it the next day... I won't miss that.
01:02:38
Speaker
Do you think we're going to have to start having things like... so I don't know if you've got the equivalent here, but I know that Claude Code has a CLAUDE.md file, which tells it about the project so it doesn't have to keep recalculating it. Do you have an equivalent to that? And do you think we're going to start seeing production servers that ship with a here's-how-you-debug-production file?
01:02:58
Speaker
Totally. So the first answer is yes. We have a rules concept, which is pretty analogous to CLAUDE.md. Not exactly the same, but the same concept, where you can document persistent context for the LLM.
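For readers who haven't seen one of these files: it's just plain text that the agent reads before it starts work. A hypothetical example of the kind of rules such a file might hold (all specifics below are invented for illustration):

```
# Rules for agents working in this repo (hypothetical example)
- Run the test suite with `make test`; never invoke `cargo test` directly.
- Files under src/gen/ are generated; do not edit them by hand.
- Staging logs live in /var/log/app/; ask before restarting any service.
```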
01:03:18
Speaker
I think we will probably literally support CLAUDE.md because it's a popular format. It's just a text file that gets added as context. To your second question of, will every server have that?
01:03:36
Speaker
Maybe. The alternative is doing it on demand. And so I guess one thing that is really changing is how you think about documentation as a developer.
01:03:53
Speaker
You should be wondering if it makes sense to even have static documentation, because the problem with documentation is that it gets out of date, like, immediately, right? That's always been the problem with documentation. And so the alternative is to always do it on demand.
01:04:13
Speaker
Now there's a cost and a latency with that, but you get around this are-the-docs-stale problem. So the short answer is I don't know. Maybe it makes sense to
01:04:25
Speaker
create this intermediate representation that is pretty fresh. Think of it as a cache of your documentation. But I also think you've got to think about documentation totally differently in today's world, in that it's just going to get out of date, and there's now sort of a solution to that.
01:04:44
Speaker
I suppose the interim is a GitHub Action that regenerates the Claude config file every day. You can always do something like that. Yeah, you can think of it as an optimization, but I think that makes sense.
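A minimal sketch of that interim approach: a small script, run nightly from cron or a scheduled CI job, that rebuilds the context file from live sources so it can't drift far from reality. The file names and commands here are hypothetical, not a real Warp or Claude Code integration.

```python
#!/usr/bin/env python3
# Rebuild an agent context file from live sources (hypothetical sketch).
# Run nightly via cron or a scheduled CI workflow.
import subprocess
from datetime import date

def capture(*cmd: str) -> str:
    """Run a command and return its stdout."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout

sections = [
    "# Project context (regenerated nightly; do not edit by hand)",
    f"Last generated: {date.today().isoformat()}",
    "## Top-level layout\n" + capture("ls", "-1"),
    "## Direct dependencies\n" + capture("cargo", "tree", "--depth", "1"),
]

with open("CLAUDE.md", "w") as f:
    f.write("\n\n".join(sections) + "\n")
```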
01:05:00
Speaker
Okay, there's one big thing that I wasn't sure when to insert into this conversation, because it's kind of overarching to everything. Yeah. And you should have an especially good answer for this, because you're a terminal person.

Security and Multi-Agent Management in Warp

01:05:12
Speaker
Yeah. What about security? What about the security of an LLM telling your terminal to execute arbitrary code? Yeah, so this is definitely a major risk.
01:05:25
Speaker
So the way that we think about this is: you as a developer, or maybe as a company that is buying Warp for your developers, should have control over how autonomous you allow an agent to be.
01:05:44
Speaker
And so by default, the way the agent ships in Warp today, you're effectively in the loop for kind of everything.
01:05:56
Speaker
And if you don't want to be in the loop, you've got to say to the agent: I want to allow you to make code diffs, I want to allow you to run commands.
01:06:09
Speaker
And then there's a second layer on top of that, which is: and by the way, never run rm without my explicit approval, never run sudo without my explicit approval, never run curl. And so we let you configure that.
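A minimal sketch of that second layer, assuming a simple denylist. The rule format is invented for illustration; it is not Warp's actual configuration schema.

```python
# Gate specific commands behind explicit human approval (illustrative only).
import shlex

REQUIRE_APPROVAL = {"rm", "sudo", "curl"}  # commands the agent may never auto-run

def agent_may_run(command: str, auto_approve: bool) -> bool:
    """True if the agent can run `command` without asking the user first."""
    if not auto_approve:
        return False  # default mode: the human is in the loop for everything
    tokens = shlex.split(command)
    return bool(tokens) and tokens[0] not in REQUIRE_APPROVAL

assert agent_may_run("ls -la", auto_approve=True)
assert not agent_may_run("rm -rf build/", auto_approve=True)
assert not agent_may_run("sudo systemctl restart app", auto_approve=True)
```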
01:06:26
Speaker
That's one approach. That's what we have today. I think a second approach that we're looking into, which exists in some other tools, is more of a sandboxing approach, where
01:06:38
Speaker
Warp runs on your local machine and it runs effectively as you. And so that's kind of dangerous, just inherently. So you could create more guardrails by having it run in some sort of sandbox environment, whether it's a Docker container or on a remote machine or something like that.
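A sketch of the sandboxing idea: route an agent's shell commands through a throwaway Docker container with no network access and no host filesystem mounted. This is purely illustrative, not how Warp implements it.

```python
# Run an agent-proposed command in a disposable container (illustrative).
import subprocess

def run_sandboxed(command: str, image: str = "ubuntu:24.04") -> str:
    """Execute `command` in a fresh container with no network access."""
    result = subprocess.run(
        ["docker", "run", "--rm", "--network=none", image, "sh", "-c", command],
        capture_output=True, text=True,
    )
    return result.stdout

print(run_sandboxed("echo hello from the sandbox"))
```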
01:06:56
Speaker
And so that's also, I think, a decent solution here. There's another level of solution, which is more around data privacy and security, which I think people have good concerns around, which is: I'm worried about leaking secrets and them getting into the LLM's training set. And so for stuff like that, we have multiple levels of things that we do. We basically run a bunch of regexes, which are also controllable by you or by your company, to scrub anything that looks like a secret before it gets
01:07:31
Speaker
sent to an LLM. With all the LLMs that we use, we have what are called zero data retention policies, where they are contractually not allowed to train on any of the data or store it for more than a certain period of time. I think it's 30 days.
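A minimal sketch of the regex scrubbing described a moment ago. The patterns below are illustrative, not Warp's actual rule set; the point is that they run before anything leaves the machine, and that users or companies can supply their own.

```python
# Scrub anything that looks like a secret before it is sent to an LLM.
import re

SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                         # AWS access key IDs
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                      # GitHub personal tokens
    re.compile(r"(?i)(api[_-]?key|password)\s*[:=]\s*\S+"),  # key=value style secrets
]

def scrub(text: str) -> str:
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

print(scrub("export AWS_ACCESS_KEY_ID=AKIAABCDEFGHIJKLMNOP"))
# -> export AWS_ACCESS_KEY_ID=[REDACTED]
```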
01:07:48
Speaker
And so we really do the best that we can to make it totally controllable, to prevent you from shooting yourself in the foot, and also to make it transparent what we're doing.
01:08:03
Speaker
But, you know, it's powerful, and with great power comes great responsibility. So you've got to kind of know what you're doing. It's a power tool for sure. Yeah, don't click auto accept blindly.
01:08:16
Speaker
Yeah, I personally run it where I allow it to just make code changes, because I'm always running in a Git repo, and because of that there's a sort of sandbox around it.
01:08:30
Speaker
Running commands is maybe a little bit scarier, but I've never had it actually, you know, delete my hard drive or anything.
01:08:41
Speaker
We have all these other guardrails in place, but you should be careful when using it, for sure. It's a power tool. We'll be back in another year to see if you've accidentally deleted your hard drive.
01:08:52
Speaker
I think the irony is I'm more likely to accidentally delete my hard drive than the agent is. Totally, yeah. I'll put rm space slash space, like I'll forget to escape a space, and then whoops. That's more likely to happen, I think. I've done it a couple of times, so my batting average is worse.
01:09:10
Speaker
Yeah. Yeah. That makes me think that, okay, I could definitely see dev containers or something like that being an integrated part of this. And that reminds me of one more thing I wanted to ask you about, which is that there's a certain implied linear workflow here: you and one agent working on one change. But you must have thought about parallelizing work streams.
01:09:33
Speaker
Oh, this is a core part of the new version of Warp. It natively supports multiple agents running at once. And what I mean by that, and I think you have to parallelize, because otherwise you're just sitting there watching an agent work, is that every terminal session can effectively support its own agent running at a time.
01:09:55
Speaker
And what that means is you can, just by splitting panes or creating new tabs, effectively get up to N agents running. And then we built...
01:10:07
Speaker
some really cool UI around being able to see what all the agents are doing at a glance, and being able to get system notifications or in-app notifications when an agent needs your attention. And I think that's definitely the world today.
01:10:22
Speaker
The world tomorrow is that it won't necessarily be a human that's launching the agent; an agent will launch in reaction to some sort of system event.
01:10:34
Speaker
And so managing that, the management and orchestration of these agents, becomes an even bigger feature that we need to get really right.
01:10:47
Speaker
How do you do that? I mean, what's your design? What's your approach? What are you foreseeing? So I'm foreseeing something... and there's a bunch of different UIs that might support this. One would be something that kind of looks a little bit like Slack, where you can see all of the logs of what agents are doing across your system, by some sort of topic or something like that.
01:11:14
Speaker
Another design for this looks a little bit more like GitHub Actions, where it's more of an if-this-then-that type architecture for what your agents are doing. We're trying to figure out what we think is the best bird's eye view for your agents, but also for your team's agents, for agents in your infrastructure, for managing and controlling them. And then, I think really importantly, for correcting and hopping in and helping, at least for the near term, because these agents are fallible. And so, as a developer, you're going to want one really nice interface for
01:11:55
Speaker
sort of going in and correcting what they're doing. So that's another important aspect of it. Yeah, a terminal with a built-in orchestration layer. It's not a terminal. I'm going to keep repeating that. I mean, it is a terminal.
01:12:08
Speaker
But fundamentally, what we are trying to build here is not a terminal, not an IDE. It's a new type of tool that we call an agentic development environment, which is really geared towards launching, managing, correcting, and working with agents.
01:12:27
Speaker
It looks a lot like a terminal, and it still works as a terminal, just to be clear, so I don't want to confuse people too much. But that's mainly because we think the terminal form factor is basically what you want for this.
01:12:41
Speaker
Yeah. Okay, then final speculation question. There are lots of people and lots of money going into figuring out the right way we will interact with these agents.
01:12:53
Speaker
Do you think we're going to see one winner, which you must hope is you? Or do you think it's going to be like everyone has a different editing style, and we're all going to have different agent styles?
01:13:05
Speaker
Awesome question. I think historically, if you look at developer tools, developer products, it's been very fragmented. Developers have a lot of personal choice in configuring their cockpit, so to speak.
01:13:24
Speaker
And when we go and sell Warp to big companies, the general attitude today matches that, which is that no CTO or VP of engineering wants to force a tool onto an engineer. There's a general...
01:13:44
Speaker
just a general agreement at the moment that engineers are best positioned to choose the tools that will make them most productive. And so what you see at companies is companies getting licenses for all different kinds of tools, whether it's, you know, Cursor plus Warp plus Claude Code; they might allow their developers to pick the one that they feel makes them most effective.
01:14:09
Speaker
So that's what I see right now. I guess what would change that is if there emerges some sort of network dynamic where there's increased value for people using these tools together.
01:14:24
Speaker
Or if some tool... I think it's a network dynamic, actually. So if we're one of these tools that does the best at learning your team's infrastructure, or has higher switching costs, or something like that.
01:14:42
Speaker
But as long as these tools are very easy to switch between, I think it will stay as: developers are best positioned to pick the thing that's going to make them most productive. So I don't really see a winner-take-all thing at the moment, but that could change. And we're not trying to force that world either, to be honest. We want developers to opt in to using Warp because they find it to be the most powerful and productive way of doing agentic development. But I totally understand that some developers are going to want to use Cursor or use Claude Code or whatever.
01:15:19
Speaker
Well, I hope it stays that way. I hope you do well, but I hope you keep the pressure on to keep doing well. Yeah, exactly. I think it's better for developers if there's a competitive dynamic, where all of these companies and open source projects that are trying to produce stuff that helps developers ship better keep competing and keep pushing the state of the art forward. I think that's healthy.
01:15:48
Speaker
Yeah, yeah. On that note, I should probably leave you to go and keep building the future. Zach, it's been great to talk to you. Chris, thank you for having me. Could I give a promo code for people who want to try Warp?
01:16:01
Speaker
I think I've made you sing for your supper for well over two hours now in total, so I think you can do that. Okay, so if people who have listened to this are interested in trying out Warp, it's at warp.dev, and we're giving away a free month of Warp Pro, which comes with extra AI limits so you can do more agentic development.
01:16:21
Speaker
The promo code is WARPDEVS25. Let me say that again: WARPDEVS25.
01:16:32
Speaker
All capitals, no spaces. I'll stick a link in the show notes for you. Thank you, that'll be helpful. I think we could have picked a simpler one, but whatever. Yeah, this was awesome. I really enjoyed chatting, Chris. It's a pleasure. And I really look forward to getting you back next year and finding out what you've cooked up by then.

Episode Reflection and Listener Engagement

01:16:49
Speaker
Yeah, let's do it. No one, absolutely no one, knows what it's going to be, right? Who knows what's coming, yeah. Interesting times. Cheers, Zach. Yep. Thanks, Chris. Thank you very much, Zach. I'm going to let him get away with the plug at the end, because I've probably had about four or five hours of his time by now.
01:17:04
Speaker
But just to be clear, this wasn't a sponsored episode. I just find Zach an interesting guy. I hope you did too, but let me know in the comments. You know, like I said at the start, it's sometimes hard to hear the voices of real developers over the hype of Silicon Valley at the moment. So let me know what you're thinking.
01:17:23
Speaker
I'd be interested in hearing everyone's take on how much you think AI is going to change the game. I really don't know. I know it's going to change it somehow. I don't know how much, or how.
01:17:34
Speaker
So give me your two cents. Whether you're commenting on that or not, if you've enjoyed this episode, please take a moment to like it, rate it, share it with a friend or a colleague, and make sure you're subscribed because we're back soon with another episode.
01:17:49
Speaker
But until then, I've been your host, Chris Jenkins. This has been Developer Voices with Zach Lloyd. Thanks for listening.