
InfluxDB: The Evolution of a Time Series Database (with Paul Dix)

Developer Voices

How hard is it to write a good database engine? Hard enough that sometimes it takes several versions to get it just right. Paul Dix joins us this week to talk about his journey building InfluxDB, and he's refreshingly frank about what went right, and what went wrong. Sometimes the real database is the knowledge you pick up along the way....

Paul walks us through InfluxDB's evolution from error logging system to time-series database, and from Go to Rust, with unflinching honesty about the major lessons they learnt along the way. We cover everything from technical details like Time-Structured Merge Trees to business issues like what happens when your database works but your pricing model is broken.

If you're interested in how databases work, this is full of interesting details, and if you're interested in how projects evolve from good idea to functioning business, it's a treat.

--

Support Developer Voices on Patreon: https://patreon.com/DeveloperVoices

Support Developer Voices on YouTube: https://www.youtube.com/@developervoices/join

InfluxData: https://www.influxdata.com/

InfluxDB: https://www.influxdata.com/products/influxdb/

DataFusion: https://datafusion.apache.org/

DataFusion Episode: https://www.youtube.com/watch?v=8QNNCr8WfDM

Apache Arrow: https://arrow.apache.org/

Apache Parquet: https://parquet.apache.org/

BoltDB: https://github.com/boltdb/bolt

LevelDB: https://github.com/google/leveldb

RocksDB: https://rocksdb.org/

Gorilla: A Fast, Scalable, In-Memory Time Series Database (Facebook paper): https://www.vldb.org/pvldb/vol8/p1816-teller.pdf

Paul on LinkedIn: https://www.linkedin.com/in/pauldix/

Kris on Bluesky: https://bsky.app/profile/krisajenkins.bsky.social

Kris on Mastodon: http://mastodon.social/@krisajenkins

Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/

Transcript

Introduction to Time-Centric Data Problems

00:00:00
Speaker
There's a whole class of data problems where the most important questions begin with time. I'll give you my most heart-stopping example. The server crashed at 3 a.m.
00:00:11
Speaker
So what happened at 2:59?
00:00:15
Speaker
Or, a happier example: we always make more sales on a Friday, but when should we start scaling up to handle the load? Time is often the first criterion you pick when you're answering data questions.
00:00:28
Speaker
And if you've got a lot of data, you can easily be looking for a specific time range in several billion rows. And that kind of thing doesn't scale trivially. So for a time-centric data set with time-centric queries, it starts to make sense to look at a time-centric database, or a time-series database, to use the jargon.

Interview with Paul Dix on Time Series Databases

00:00:51
Speaker
But how do you build one of those? What makes a database a time series database? It's an interesting engineering problem. So I asked an expert, I asked Paul Dix of InfluxDB to come along and explain how you build a time series database.
00:01:07
Speaker
And happily, he goes into great detail about the ins and outs of it. Things like time structured merge trees and what they are. And then we get into some juicy questions I wasn't expecting answers to.
00:01:19
Speaker
Like, what happens if you build some technology that your customers like, but you think you need to do a massive rewrite if it's going to grow? What do you do if you get the technology right, but you get the pricing model wrong and you have to course correct on that?
00:01:35
Speaker
What if everything is so successful that you get pulled into management when really you kind of wanted to stay writing code? I got talking to Paul about all of that, and unusually for a CTO coming out of Silicon Valley, he was incredibly straightforward about the mistakes as well as the successes. He's very frank about what he had to learn the hard way, things you'd like to know before you go down that road.
00:02:01
Speaker
Things from, should we write it in Go or Rust? To what happens when the logical pricing model is the wrong one? To the pros and cons of using Parquet files.
00:02:14
Speaker
If you're interested in how databases get built, there's a lot here. And if you're interested in the problems you face when the technology becomes a whole business, there's even more.
00:02:26
Speaker
I'm your host, Kris Jenkins. This is Developer Voices. And today's voice is Paul Dix.
00:02:43
Speaker
Joining me today is Paul Dix. Paul, how are you? I'm good. How are you? Very well. Glad to see you. Coming in from New York City? I am in Williamsburg, in Brooklyn.
00:02:55
Speaker
In Brooklyn. Do you know, New York is... I live in London. New York is the only place I've been to in the world that has genuinely overwhelmed me. Really? Yeah. Coming from London. Coming from London. The first time I went there, I showed up in Times Square, and I came out of the underground, the subway, and I looked out and I went, oh my God, this is big.
00:03:17
Speaker
I mean, Times Square is a bit overwhelming. I mean, most New Yorkers actually avoid Times Square. So it's just the tourist thing, huh? Yeah. Well, I mean, it seems like, are you dropping into Piccadilly all the time? Yeah. Okay. Yeah. That's fair. That's fair.
00:03:33
Speaker
So I'm going to try and thread the needle here from overwhelming the senses to being overwhelmed by too much data. Let's talk about time series. Nice. That's a little thin, but sure. Let's get into time series. We'll just cross any bridge that's available. We'll go with that.

Advantages and Structure of Time Series Data

00:03:51
Speaker
So you're the CTO of Influx, one of the co-creators of InfluxDB, right? Yes. Yeah. So I'm the co-founder and CTO; InfluxData is the company and InfluxDB is the project, right?
00:04:06
Speaker
So I was the founding CEO. And after about three years, I hired in my replacement, Evan Kaplan, who is now the CEO, and we still work together today.
00:04:17
Speaker
Nice. It's not always a relationship that goes well. Yeah, no. I mean, it's had its ups and downs, but it's been really great over the years. I want to get into this because you seem to have spent a decent part of your kind of Silicon Valley-esque tech management career avoiding being sucked all the way into management, keeping your hands on the tech.
00:04:38
Speaker
But we'll get there. But first time series. But first time series, yes. Okay, so I want to start this from the naive point of view, and you can wise me up. Up to a certain number of rows...
00:04:49
Speaker
I reckon, let's take a small number, let's say 10,000 rows, I can stick 10,000 rows in Postgres, create an index on a time column, and I am happy and no one can tell me different.
00:05:02
Speaker
So what is it the time series database does that that doesn't? Is it just scale, or is it something interesting elsewise? Yeah, so I mean, this is funny. I talked about this the very first time I gave a talk on InfluxDB.
00:05:17
Speaker
It was literally the thing I said. It was like, you know, you've got a relational database, you've got a time column, just index it by that, slap a time column on it, or, you know, organize by that, and there you go.
00:05:28
Speaker
And scale is obviously the first problem, right? When you think about relational databases and OLTP databases, transactional databases, they're inherently designed to handle a certain amount of scale, right? That's for bank transactions, for purchases on an e-commerce website. And aside from, you know, a handful, a few dozen companies, most people don't have to deal with a transactional workload scale that is too crazy, right?
00:06:00
Speaker
They certainly don't have to deal with a billion rows a day getting written into the database, right? Now, of course, if you're Amazon, that's not true. If you're Stripe, that's not true, right? And they have all sorts of specialized stuff to deal with that.
00:06:13
Speaker
But from a database perspective, most transactional databases are not designed to handle a billion rows a day getting written into the database and trying to organize that data.
00:06:25
Speaker
And in the time series use case, a billion rows in a day really isn't that much data, right? So, backing up: what is time series data, and what is a time series database, right?
00:06:39
Speaker
So time series data is really just, at its simplest level, a collection of time-value pairs ordered by time, right? And then there's the stuff that adds a bunch of complexity on top of that. And the values could be whatever it is you're measuring, right? They could be floats, bools,
00:06:57
Speaker
integers, it could be strings, it could just be raw bytes. But it's basically some piece of data that you're organizing by time. And then... I'm just going to pause you quickly there. So you're implying it's always primitives rather than structured objects?
00:07:11
Speaker
Well, so then you layer the structure on top, right? So the primitive, the absolute core primitive, is some kind of value and a timestamp and the ordering, right? The time-based ordering.
00:07:26
Speaker
The structure you layer on top: most time series databases have something called either tags, or labels in Prometheus, right? Which basically attach metadata to all of these time series, right?
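To make those primitives concrete, here is a minimal sketch in Go (the language InfluxDB was originally written in, as discussed later in the episode); the type and field names are illustrative, not InfluxDB's actual types.

```go
// A sketch of the core primitives: a point is a (time, value) pair, and
// tags attach metadata to a series of points ordered by time.
package sketch

import "time"

type Point struct {
	Time  time.Time
	Value float64 // could equally be an int, bool, string, or raw bytes
}

type Series struct {
	Measurement string            // e.g. "temperature"
	Tags        map[string]string // e.g. {"sensor": "s42", "region": "west"}
	Points      []Point           // kept ordered by time
}
```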
00:07:39
Speaker
Okay. So... But realistically, the way I view time series is: to me, time series is basically just a way of representing data. And any database can give you back a time series, right? You can query against the data. And as long as you have time-stamped data, you can query it back and represent it over time, right? And a lot of people, when they think about time series, they think about metrics, which are basically pre-computed time series, right? You say you're going to take samples
00:08:10
Speaker
every 10 seconds or every minute. You're going to measure a thing. It could be an average, it could be a percentile, it could be a count, it could be, you know, something else. You're going to measure that thing and you're going to attach a bunch of metadata to it.
00:08:23
Speaker
And then you store that, and then you want to query against that. So in its purest sense, a lot of people think that's what a time series database is. And to some extent it is. But in my mind, you can pull time series out of really any kind of observational data that you can think of, right? If you take a bunch of log files, you could...
00:08:44
Speaker
produce, on the fly, basically an infinite number of time series out of a bunch of log files, right? If you had an Apache log, it could be requests to a specific page, it could be 200s, requests from specific addresses, from specific users, rolled up on all sorts of different levels, like by different geographies.
00:09:03
Speaker
right There are a bunch of different ways you can slice and dice this data, but the end result is what you're looking at is something that's changing or not changing over time, right? And you're doing this generally because you want to either put it on a graph so a human being can look at it, or you want to build a monitoring system that looks for some sort of change that is important, that's going to trigger either an alert to a human being, or it's going to trigger some change in a system, some automation that you're doing, right? And
00:09:35
Speaker
Server monitoring is the easiest one for programmers to think of. So if you think of, you know, you're operating a service inside of Kubernetes and you want to scale up the number of containers or scale down the number of containers, what you'll do is you'll track a number of metrics that represent kind of the load on the system, RED or whatever you want to track.
00:09:55
Speaker
And then you have the automation look for the change, and then trigger the scale-up or the scale-down based on that.

Efficient Data Models: Polling vs. Push

00:10:02
Speaker
That's interesting. Does that imply there's like some kind of push-based streaming in the database? Or are you just saying, make it fast enough that you can poll that query over and over? Both approaches are valid. And I've worked with time series data where we've tried both. Before InfluxDB, I worked for a fintech startup, and we were doing real-time pricing in the corporate bond market and all sorts of other stuff. And the original solution we had was a push-based model where...
00:10:30
Speaker
people connect to the database and subscribe to the things they want to listen to, and it will push it out. And actually what we found was that was terrible in terms of performance and scalability. It was great in terms of guaranteed latency, or not even guaranteed, like average latency of getting a notification.
00:10:48
Speaker
But a much better way to structure the whole thing from an operations, reliability, scalability and predictability kind of standpoint was actually a polling model.
00:10:59
Speaker
right This was a case where we didn't need, you know, everybody talks about like real time, which I put in air quotes, because real time is highly dependent on context, right?
00:11:12
Speaker
If you're tracking, say, the temperature changes for some massive boiler, right? Real time on that thing... the temperatures don't change that fast, because it's this huge thing. If you monitor that once every, you know, 10 milliseconds, it's not going to matter. Like, you know, you can take a measurement every 10 minutes and you're totally fine, right?
00:11:38
Speaker
And real time is also dependent on what's going to happen, right? If you're triggering an alert to a human being, you know, if you take 100 milliseconds to trigger that alert or five seconds, generally it doesn't really matter if it's going to a human, because their response time is going to dominate, right? The time of the system is not going to matter. But if you look at use cases like high-frequency trading, where they're making trading decisions, one, they don't use commercial time series database solutions, right? Essentially every single one of them custom writes their own stuff, because their expectation is: we have to get these events and respond to these events in less than 50 microseconds.
00:12:22
Speaker
right So their perception of what constitutes real time is very, very different. So it's all use case dependent, and it also depends on where you're collecting the data and what you're doing. I think...
00:12:36
Speaker
That's actually going to become a lot more obvious when people look at use cases in like robotics, right where in some cases like you need the database and the collection and the actual action very, very close to the physical system that's operating versus you know a cloud-based solution for server monitoring where it doesn't matter if you're going across half a continent, you're going from you know the west coast of the United States to the east coast to send your data,
00:13:03
Speaker
because a server monitoring system doesn't need to respond in the tens of milliseconds that robotic systems need to respond in, for example. Yeah, yeah. And so this is all getting pretty far afield from the original question, which is, what is time series data? What is a time series database? So why

Time Series Database Query Types and Retrieval Needs

00:13:21
Speaker
a time series database?
00:13:22
Speaker
So scale is the most obvious one, right? There's that. And there's basically a set of queries that you want to do. Like, time series queries are almost always... well, there's two kinds of time series queries. One, which is not really time series, which is: get me whatever the last value measured was for this sensor or this collection of sensors, right? And that's something you kind of want to optimize for.
00:13:45
Speaker
And then there's another, which is: get me the time series data for this sensor, or these collections of sensors, and either you want just the raw data, or you want to compute some sort of summary against it, right? The summary could be the average, the p99, or you could want, like, you know, the min, the max, the mean, all of those to come back.
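For a rough sense of what those two query shapes can look like in InfluxQL, InfluxDB's SQL-like query language, here are two hedged sketches; the measurement and field names are hypothetical.

```sql
-- Shape 1: the latest value measured for one sensor.
SELECT last("value") FROM "temperature" WHERE "sensor_id" = 's42'

-- Shape 2: a summarized range across a collection of sensors
-- (a percentile would be computed the same way).
SELECT min("value"), max("value"), mean("value")
FROM "temperature"
WHERE "region" = 'west' AND time > now() - 24h
GROUP BY time(1m)
```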
00:14:07
Speaker
And essentially, in our use cases, almost always people want those to come back within tens of milliseconds, or hundreds of milliseconds at most, right? These are not...
00:14:21
Speaker
analytical report-style queries where you're OK with it taking tens of seconds or minutes to go, like where you submit a job and it runs in this background system and it takes forever, right? The expectations in our use cases of time series, which are a lot of sensor analytics, server monitoring, custom metrics, network telemetry, satellites, rockets, all these things, their expectation is they can write the data in,
00:14:48
Speaker
And within hundreds of milliseconds, it is queryable. And then when they query it, most of those queries return in tens of milliseconds or hundreds of milliseconds.
00:15:01
Speaker
And again, that's the threshold where, for a human, you can't tell the difference between that and it being instant. Yes, exactly. That's all driven by the kind of user experience that you want to have within your app or within your dashboarding solution, or driven by your monitoring system where you have all these queries operating. And I would say a lot of monitoring systems operate on this basically polling model.
00:15:29
Speaker
right You have this condition that you're checking for, and you just choose a frequency on which to check for it. So if you're checking for something once every 10 seconds or once a minute, usually what you're doing is you're actually checking for something recent, but a lot of times you're not just checking that last 10-second period or that last one-minute period.
00:15:52
Speaker
What you're actually doing is you're querying a little bit further back in time so you can get a little bit more context, right? To say, like, what's going on, right? So you're not just querying a point in time. You're actually querying a range, and it's a moving range of time of, like, give me the last five minutes' worth of data for this thing so I can run some computation and see if there's a thing I'm looking for.
00:16:12
Speaker
Yeah. If the CPU is over 90%, then show me what it was doing for the past five minutes so I can understand what's happened leading up to 90%. Yeah, and those monitoring and alerting queries, a lot of times those will represent more of the query workload in a production system than all the other queries combined, right? The other queries are basically, usually, dashboards that people are looking at that are refreshing on an interval, right? Usually the refresh interval is one second, five seconds, 10 seconds, 30 seconds maybe at most.
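A hedged InfluxQL sketch of that kind of monitoring check, with hypothetical names: the condition polls a moving five-minute window, bucketed so the alerting logic has context rather than a single point in time.

```sql
-- Run this on a fixed interval; alert if recent buckets breach a threshold.
SELECT mean("usage_percent")
FROM "cpu"
WHERE time > now() - 5m
GROUP BY time(10s), "host"
```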
00:16:48
Speaker
Okay, so take me under the hood. Does this mean that you've got a kind of two-tier, two-part architecture, one that's dealing with a hot set and one that's dealing with the archive stuff on disk?
00:17:00
Speaker
Ideally, ideally, yeah. Because, again, because of the performance expectations, you expect the hot set to be in RAM, but keeping that data in RAM is super expensive, right? RAM is the most expensive place you can put it.
00:17:15
Speaker
And the other kind of problem about this is that the hot set that you have is constantly changing. It's basically like... there are only two hard problems in computer science: naming things and cache invalidation. And basically this is constant cache invalidation. Your cache gets invalidated literally every second as new data is coming in.
00:17:35
Speaker
So there's the scale piece, which is just the volume of data, but then the other piece is this managing of the data lifecycle between getting it in and putting it in RAM so it's fast for query, and moving it off to either a locally attached disk or object storage, ideally, right? Because object storage is now essentially the default, de facto place where people are going to store data, and it's cheap and scalable and all this other stuff.

Managing Data Lifecycle and Storage Efficiency

00:18:02
Speaker
But then you need to also manage, you know are you going to use local disks for caching data? And if you're going to do that, how do you move data in and out of that local disk cache and make the whole thing efficient?
00:18:14
Speaker
And then other pieces of this use case are, you know, do you want some built-in downsampling? So it's very common to collect very high precision data.
00:18:27
Speaker
And the value of that data when it's recent is much, much higher than as it gets older. And as it gets older, you maybe don't need the same level of precision. Yeah. So if you're getting data every 10 milliseconds, eventually when it gets to like a year old, you only care about every 10 seconds.
00:18:45
Speaker
Right, or minute, or hourly, right? Yeah. Exactly. Makes sense. And basically, what that means is, if you're collecting every 10 seconds and you change it to one-minute summaries,
00:18:57
Speaker
you know, you're reducing the amount of data you need to worry about by a factor of six, right? Usually that's actually not quite the case because people will keep multiple summaries. So you don't quite get that clean of a, you know, reduction.
00:19:12
Speaker
But usually, I think, what people will do is they'll have one-minute summaries or five-minute summaries or 10- or 15-minute summaries. And then one-hour summaries is basically the last tier, right? Yeah.
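In InfluxDB 1.x, this style of built-in downsampling is what continuous queries provide. A sketch, assuming hypothetical database, retention policy, and measurement names:

```sql
-- Roll raw points up into one-minute means stored in a longer-lived
-- retention policy; the five-minute and one-hour tiers would look the same.
CREATE CONTINUOUS QUERY "cq_cpu_1m" ON "metrics"
BEGIN
  SELECT mean("value") INTO "metrics"."one_year"."cpu_1m"
  FROM "cpu"
  GROUP BY time(1m), *
END
```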
00:19:25
Speaker
Yeah, you're making me think, because I was expecting the technical meat of this to be, actually a B-tree index won't cut it, we need specialized time-based indexes.
00:19:36
Speaker
But you're making it sound like actually the biggest part of the implementation challenge is kind of data logistics, like just managing that data in different ways. So there's that. I mean, there's also the underlying file structure, right? For InfluxDB 1, we wrote our own storage engine from scratch to be able to deal with this. And the reason for that is, one of the things that makes the time series use case so tricky is your data arrives, what I like to say, wide. Basically, you have a single data point per series for a single point in time. You have "now" data arriving, but whenever you query...
00:20:15
Speaker
you're not querying just now. You're querying down. The data arrives wide, but you're querying deep. So when you query, you're saying, I want the data for this sensor, or an aggregation across this group of sensors, for the last 24 hours, right? So the data arrives like this, but you're querying like this.
00:20:37
Speaker
And the trick is... how do you optimize for that? Right? Because if the data arrives like this, the easiest thing you could do is say, we're just going to append it to files.
00:20:48
Speaker
And we're going to create new files. Basically, say we just open up a file every 10 minutes, and we append the data into it. And then once the 10 minutes is done, we write that file to disk, or we write it to object storage, right? And we kill the log for that.
00:21:02
Speaker
And then we just keep going. And you can scale infinitely there, and it's great, and you can ingest all the data, and it's super cheap. And then the second you try to query it, the query is just going to be unbelievably slow, right? Because to query it, you're going to look at the time, and then you're going to say, okay, I have every 10-minute file for that block of time, but then you have to churn through all of it.
00:21:25
Speaker
So the trick is, you have your data arriving like that, and you need to reorganize the data on disk so that you can query quickly against it, right? And in some sense,
00:21:39
Speaker
it's an indexing problem. But as anybody who's worked with indexes in a database knows, the more indexes you add and the higher the throughput on the table, the more painful it becomes, right? You end up using more RAM and more CPU to maintain this index. So those are the problems, right? So it's very much a data lifecycle logistics problem, but it's like,
00:22:03
Speaker
the storage system that you end up designing for it is very, very different, I think, than, you know, let's just use a B-tree. Like, actually,
00:22:15
Speaker
in version 0.9 of InfluxDB.

Technical Insights into InfluxDB's Development

00:22:20
Speaker
So, version 0.8 of InfluxDB: we were using LevelDB, which is an LSM tree built by Google, open source.
00:22:27
Speaker
RocksDB was a popular fork of it that was created around that time at Facebook. So in version 0.9, we switched over to a copy-on-write B+ tree called BoltDB, written by Ben Johnson, because it was written in Go at the time.
00:22:42
Speaker
It was a great, rock-solid implementation, whatever, but the performance of it was absolutely abysmal. It did not work for this, which is what led us to creating our own storage engine, which was essentially...
00:22:57
Speaker
The storage engine we built is called the TSM tree, the time-structured merge tree. And the reason we called it that is because it's essentially inspired by the LSM tree. It's like the LSM tree, but it's optimized for this specific time series use case.
00:23:14
Speaker
Can you explain how it works? Oh... without the use of diagrams? Yeah, yeah. Just waving my hands around. No, you can't even do that, because this is on Spotify as well. Oh, God. All right.
00:23:29
Speaker
So that version of the database is, I've said this before, actually like two databases in one, right? So the first part is basically the time series part of the database, and that's that TSM tree thing, right? So essentially, you write data in...
00:23:47
Speaker
it goes into a write-ahead log, and we also keep it in RAM in a different format so that it's fast for query, right? And then periodically, what we will do is we will snapshot all of the stuff that's in RAM into a TSM file, right?
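A minimal Go sketch of that write path, with invented types standing in for the real ones; this is the shape of the idea, not InfluxDB's actual code.

```go
// Write path sketch: durability via the write-ahead log first, then an
// in-memory buffer that's fast to query, periodically snapshotted out.
package sketch

import "sort"

type point struct {
	ts  int64
	val float64
}

type engine struct {
	wal [][]byte           // append-only log of raw writes (stand-in)
	mem map[uint64][]point // series ID -> buffered points
	n   int
}

func newEngine() *engine {
	return &engine{mem: make(map[uint64][]point)}
}

func (e *engine) write(raw []byte, series uint64, p point) {
	e.wal = append(e.wal, raw)               // 1. make it durable
	e.mem[series] = append(e.mem[series], p) // 2. make it queryable in RAM
	if e.n++; e.n > 100_000 {                // 3. snapshot past a threshold
		e.snapshot()
	}
}

func (e *engine) snapshot() {
	for _, pts := range e.mem {
		// Sort each series by time and write it out as compressed blocks
		// in a new TSM-style file (elided); then the WAL segment and the
		// in-memory buffer can be dropped.
		sort.Slice(pts, func(i, j int) bool { return pts[i].ts < pts[j].ts })
	}
	e.mem, e.wal, e.n = make(map[uint64][]point), nil, 0
}
```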
00:24:05
Speaker
So basically, this is the first level: this snapshotted TSM file. And then in the background... basically, all the data in the snapshotted TSM file is organized by time-value pairs, where we've taken the data model for InfluxDB: you have a measurement, you have a database.
00:24:26
Speaker
In version one, you had something also called a retention policy, which I could talk about in a second. But think of those as one concept: you have this organizational bucket called a database and a retention policy. Within that, you have measurements,
00:24:40
Speaker
And measurements have rows, and each row has a tag set, which is key-value pairs of strings, and fields, which can be int, bool, float, or string.
00:24:52
Speaker
And then you have a timestamp, right? Yeah. And basically, a series, an individual time series, is basically the measurement name, the tag set, and the field name.
00:25:04
Speaker
right And so basically, if you have that identifier, you will then have a value and a time. That's a single capture, right? So what we do is, in the underlying storage engine, we give the measurement name, tag set, and field name a uint64 identifier. And then we organize that data in the TSM file by that ID, which is the time series, and then value-time pairs sorted by time.
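For concreteness: in InfluxDB's line protocol, a write looks like the line below, and each measurement + tag set + field combination is one series, so this single line yields two of them (cpu,host=serverA,region=west with field usage_user, and the same tag set with usage_system). The Go sketch of handing out uint64 IDs is illustrative, not the actual implementation.

```
cpu,host=serverA,region=west usage_user=23.5,usage_system=7.1 1465839830100400200
```

```go
// Sketch: map a series key (measurement + sorted tag set + field name)
// to a stable uint64 identifier.
package sketch

type seriesIndex struct {
	next uint64
	ids  map[string]uint64
}

func newSeriesIndex() *seriesIndex {
	return &seriesIndex{ids: make(map[string]uint64)}
}

func (s *seriesIndex) idFor(measurement, sortedTagSet, field string) uint64 {
	key := measurement + "," + sortedTagSet + "#" + field
	if id, ok := s.ids[key]; ok {
		return id
	}
	s.next++
	s.ids[key] = s.next
	return s.next
}
```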
00:25:34
Speaker
right Okay. And so basically, when we write that file, we organize all the data by that. And within the value-time pairs sorted by time, we break them down into blocks of 1,000. And the reason we do that is so that we can apply compression to that block.
00:25:53
Speaker
right So we compress the timestamps separately from the values. So the timestamp compression uses, like, zigzag run-length encoding.
00:26:05
Speaker
It's basically kind of a form of double-delta encoding. And then the value block gets compressed based on the data type, right? So, is it a float?
00:26:16
Speaker
If it's a float, we use the Gorilla-style compression that Facebook had written about back in, whenever that was, 2014. If it's an int, we use the double-delta run-length encoding.
00:26:32
Speaker
If it's a string, we use Snappy compression. That's interesting, that you're actually being that granular about the types of compression.
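Here is a minimal, self-contained Go sketch of the timestamp side: zigzag plus delta-of-delta into varints. It's in the spirit of what's described, not InfluxDB's exact encoding.

```go
// Delta-of-delta timestamp encoding: regularly spaced samples produce
// runs of (near-)zero values that compress to about a byte each.
package main

import (
	"encoding/binary"
	"fmt"
)

// zigzag maps signed deltas to unsigned ints so small magnitudes stay small.
func zigzag(v int64) uint64 { return uint64((v << 1) ^ (v >> 63)) }

func encodeTimestamps(ts []int64) []byte {
	var out []byte
	var prev, prevDelta int64
	tmp := make([]byte, binary.MaxVarintLen64)
	for i, t := range ts {
		var dod int64
		if i == 0 {
			dod = t // first timestamp stored raw
		} else {
			delta := t - prev
			dod = delta - prevDelta // ~0 when sampling is regular
			prevDelta = delta
		}
		prev = t
		n := binary.PutUvarint(tmp, zigzag(dod))
		out = append(out, tmp[:n]...)
	}
	return out
}

func main() {
	ts := []int64{1000, 2000, 3000, 4000, 5001}
	fmt.Println(len(encodeTimestamps(ts)), "bytes for", len(ts), "timestamps")
}
```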
00:26:44
Speaker
Yeah, yeah. I mean, well, most... any analytic file format will actually do that same kind of thing, right? Parquet does this. It has compression for the different types.
00:26:56
Speaker
The latest bit that people are trying to do with new file formats is basically how to decouple the actual file format from the fact that you can have different kinds of encodings and different compression schemes, and kind of...
00:27:09
Speaker
how do you decouple those two? Anyway, that's a different subject, not for this. Right. So then the other part of the database structure is an inverted index.
00:27:22
Speaker
right So usually you use an inverted index in document search, right? Where you have documents, which have an identifier, like a uint64 identifier, and then you have the terms in the document, and basically you map which terms appear in which document. So you have a term, like the word developer.
00:27:47
Speaker
And you say, OK, now you have a posting list for which documents that term appears in. And the posting list is basically just the IDs sorted by the ID number, right?
00:27:58
Speaker
This is the one that looks most like the index in a book, right? Where you look up a word and it tells you the page numbers. Yeah, so then what that means is you can do set operations on it, right? So if you say, like, oh, I want to search for documents that have developer voices, right, both developer and voices in it, then you get the two posting lists, you do an intersection, and then you have the set of document IDs that those appear in, right?
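The set operation itself is straightforward once posting lists are kept sorted; a minimal Go sketch (document IDs here, but the same operation applies to the time series IDs discussed below):

```go
// Intersect two sorted posting lists of IDs.
package sketch

func intersect(a, b []uint64) []uint64 {
	var out []uint64
	for i, j := 0, 0; i < len(a) && j < len(b); {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default: // present in both lists
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}
```

A union would walk both lists the same way, emitting each ID once.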
00:28:20
Speaker
Great. So the thing we did was we realized, well, the metadata that describes time series could be represented in an inverted index. right So that's the tag set.
00:28:31
Speaker
right The measurement name, the tag set, and the field name. So the tag set is, remember, the key-value pairs. So in server monitoring, it could be the region, the availability zone, the host name, the service it came from, all sorts of things like that.
00:28:47
Speaker
yeah And basically, when we're ingesting the data, we create this inverted index. And that's a separate file that gets created. So we call this the TSI, the time series index.
00:29:00
Speaker
Initially, when we built the TSM tree, this inverted index was entirely in RAM. And we would build it on the fly when the server started up.
00:29:11
Speaker
right Basically, when the server started up, we would look at all the TSM data, and we would actually build that inverted index on the fly. And this was true basically for the first year, from 1.0 of InfluxDB, which was in September 2016, until the version...
00:29:30
Speaker
it was like 1.1, I think... it was a release in, like, November...
00:29:34
Speaker
sorry, 2017, that had the disk-based indexing, right? So, essentially, what that means is you have an index where you say, okay, this tag, host A, has these time series that it maps to.
00:29:52
Speaker
And region West has these time series that it maps to. And, you know, whatever the tags are that you represented. So when a query comes in,
00:30:04
Speaker
we look at, you know, what measurement are you selecting from. You can think of a measurement like a table in a SQL database. yeah And tags and fields and time, those are just columns. They're columns of specific types.
00:30:18
Speaker
So then we look at the WHERE clause of the query, and we look at the tags that you have in it. So we look at those tags, and we look up the posting list for the measurement name. We look up the posting lists for the tag key-value pairs, and we do intersections or unions depending on what the structure of the WHERE clause looks like, what the predicate looks like.
00:30:43
Speaker
you know, are you saying AND or OR, or whatever, right? yeah So we look at the inverted index to get the set of time series IDs, and then we go to the TSM structure to look up: this ID, this time range, give me the data.
00:31:00
Speaker
And the TSM structure is, like I said... what happens is that data arrives as a file. And the thing I haven't talked about is the compaction process, right? So this is basically the trickiest part of the whole thing, right? All this other stuff is easy.
00:31:19
Speaker
The trickiest part is that, you know, you get this set of data and say we're going to snapshot, you know, 10 minutes' worth of data. And what we're going to do is we're going to write one TSM file for that 10-minute period.
00:31:33
Speaker
And then we're also going to write an index file. And the problem then is, in the background, what we have to do is we have to combine the index file and the TSM file with other regions of time, right? So basically, if we capture this 10-minute window of time, what we want to do is, say, once an hour, we want to rewrite the six 10-minute windows into a one-hour block, right? And I'm simplifying it, because it doesn't actually work like this. The problem is,
00:32:03
Speaker
in InfluxDB, you can actually load data for any point in time, so it's not guaranteed that the 10-minute window of writes that you just captured is actually the data that describes that 10-minute period of time.
00:32:18
Speaker
The timestamps on the data could be for any period of time. Oh, right. There's no guarantee that the data arrives in time order. Some data will show up a bit late, even though... That's right. ...a sensor that only connects to the internet once an hour may have an hour's worth of backlogged data.
00:32:37
Speaker
I mean, there's that. The server could go down. The thing that's collecting the data could slow down for some reason. There's always lagged data collection. Even in a system that is performing well, there's going to be some delta where some writers and some collectors are not exactly in time with the others, right? You're not going to have a strict time ordering of the data coming in.
00:33:06
Speaker
Like, if you did, this problem would be a lot easier. It'd still be hard, but it'd be easier. Okay. So basically what you have to do is, right, we're going from the 10 minutes to an hour.
00:33:20
Speaker
We're going to take the six 10-minute blocks of time and rewrite them into an hour block of time. And it could be many files that represent all this, but we're going to rewrite them.
00:33:32
Speaker
And what we're doing is we're rewriting them so that the data is now organized by series, time-value pairs. right So think about data where you're sampling once every 10 minutes.
00:33:46
Speaker
In a 10-minute window of time, you are going to have 10 samples, right? When you're sampling every minute. Yeah, when you're sampling every minute, right? So you're going to have 10 samples.
00:33:57
Speaker
And basically what that means is your compression is only going to work against 10 samples, right? Because, again, we're organized by the time series, and we block it by time series. And it also means, if you go to query,
00:34:11
Speaker
if you're pulling an hour's worth of data and you haven't reorganized the data, you're going to touch six files, right? You're going to have to look up in each of those files where that time series is. You're going to have to decompress the blocks. You're going to decompress six separate blocks, and then you're going to merge that up and send it back to the user.
00:34:31
Speaker
right yeah Now, that's fine for just an hour, but as you get into longer and longer time ranges, it becomes slower and slower. The point is, we're going to reorganize it so that, once an hour,
00:34:43
Speaker
we're going to take those 10-minute blocks of time, and we're going to shuffle all the data. So what it means is you are literally rewriting every piece of data that you received in the last hour into new files. But what that means then is, when the query comes in, you have one index that you reference for that one-hour block of time, rather than the six different indexes that you reference.
00:35:05
Speaker
So one index that you reference, you find the time series in the one place where it is, and you get the 60 samples, and you send it back, right? yeah So basically what you did there is you made it so your query had to do a sixth of the work.
00:35:22
Speaker
Before, when the query came in, you had to reference six separate indexes and look at six separate files and decompress the data and send it back. Here, now, because you've just reorganized the data for the hour, when you query that hour, you only have one index, one file.
00:35:39
Speaker
Yeah, you've amortized the cost of merging that query out, right? Yes. Yeah. So that process of rewriting the data is called compaction, right? So again, that is part of the TSM tree, and we took inspiration from log-structured merge trees, because compaction is very much a part of the log-structured merge tree.
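A simplified Go sketch of that compaction step, merging per-series data from several small time-window files so each series ends up as one contiguous, time-sorted run; hypothetical in-memory maps stand in for decoded files.

```go
// Compaction sketch: merge many small files' per-series points into one
// structure, re-sorted by time, ready to re-encode as larger blocks.
package sketch

import "sort"

type pt struct {
	ts  int64
	val float64
}

func compact(files []map[uint64][]pt) map[uint64][]pt {
	merged := make(map[uint64][]pt)
	for _, f := range files {
		for series, pts := range f {
			merged[series] = append(merged[series], pts...)
		}
	}
	for _, pts := range merged {
		// Out-of-order arrivals (discussed above) are why this sort is
		// needed; sorted runs also compress and query better.
		sort.Slice(pts, func(i, j int) bool { return pts[i].ts < pts[j].ts })
	}
	return merged
}
```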
00:36:03
Speaker
So one of the other things we did was we created what we call shards of data, which are basically just windows of time. So essentially, by default in InfluxDB 1 and 2, a shard of data is seven days, right?
00:36:21
Speaker
And what that means is you can write data for any period of time, but when it comes in, we will split it out into the separate shards that it belongs in. And what that does is it kind of ring-fences the amount of work that the compactor will have to do.
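Shard routing is then just timestamp truncation; a minimal sketch using the seven-day default mentioned above:

```go
// A point's shard is its timestamp truncated to the shard duration.
package sketch

import "time"

const shardDuration = 7 * 24 * time.Hour // the default mentioned above

func shardStart(ts time.Time) time.Time {
	d := int64(shardDuration)
	return time.Unix(0, (ts.UnixNano()/d)*d).UTC()
}
```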
00:36:40
Speaker
And it also means, even though you can load data for any period of time, the most common thing in our use case is people are loading data roughly for now, modulo some collection lag or whatever, which usually is measured in minutes or hours at worst.
00:36:58
Speaker
Yeah. Right. Even people who have sensors that only connect once every 12 hours, your collection lag is going to be, like, 12 hours, right? So what that means is a shard, after a period of time, will become what we call cold.
00:37:11
Speaker
It's not receiving any more writes. And what that means is, what we can do under the hood is what's called a full compaction, where we say, okay, we don't think this period of time is going to get more writes.
00:37:25
Speaker
So we can spend a little bit of extra effort to do a complete rewrite, a reorganization of the whole thing. Now, in many cases, the full compaction actually doesn't result in all of it getting rewritten, because there are various levels and other things have happened. And this shard is not kept in one file, right? It's kept in potentially tens, hundreds, or thousands of files.
00:37:48
Speaker
So we do the full compaction. And basically what that means is there are entire swaths of the database, the historical data, that the operational part of the database can just ignore, right? It doesn't have to worry about, do I need to compact it? This goes back to the cache invalidation thing, which is, you have a flag on those old shards that says whether or not they received a write.
00:38:14
Speaker
And then you can say, okay, now I need to look at it again and see if I need to reorganize the data. So what you've done overall is you've said, we'll take 10-minute snapshots, because that's a sensible amount of data to snapshot quickly and move on.
00:38:28
Speaker
But now we've also put an upper bound on how often we'll re-merge this for performance. Yeah. That's pretty massively simplified from how it actually works under the covers. It doesn't work exactly like that, but that's basically the structure. I'll take the logical model.
00:38:47
Speaker
I don't think we need to get into actually writing bytes. The other thing is, I haven't looked at that particular bit of code in many years. So all of this is from my faulty memory. It seems pretty warm in the cache, I have to say.
00:39:04
Speaker
I wonder if there's anyone else in the tech department at Influx screaming, "it doesn't work like that" in their head. Maybe not. You'll find out after this goes live. Yeah, yeah.
00:39:16
Speaker
I was wondering as you were explaining that, do you have any transactional problems on these TSM files? Have you just got, like, a single writer dealing with: I write this stuff out, I write the new compacted version, and swap over?
00:39:32
Speaker
Yeah, we have a single writer. Single writer. Yeah, so basically, when the swap happens, essentially there's always a write-ahead log that captures the operations, and then there's an in-memory structure, protected by a mutex, that we swap over. Okay, okay.
00:39:56
Speaker
And the other thing is, for the storage engine, there's logic under the hood to basically do deduplication of the data as you read it. right So as I mentioned, you can write data to a time series for any period of time, but the rows in a table, so a measurement... we have a protocol called line protocol, which is basically, you're writing rows of data into specific measurements.
00:40:33
Speaker
Mm-hmm. The rows are identified by the tag set and the timestamp. You can think of that as kind of like the primary key. yeah So one thing is, people can overwrite existing values, or they can overwrite the same tag set and timestamp with updates or additional fields or whatever.
00:40:52
Speaker
And under the covers, what we have is stuff that will combine the data from multiple TSM files into one merged, deduplicated stream. right So one of the nice properties of the database is, if the writer is providing the timestamp, which is what we recommend, then you can essentially just retry if there's any sort of failure and not worry about duplicates and stuff like that. The database dedupes everything for you. Yeah, so it's kind of idempotent, right? Eventually, if I send the same data over and over, it will boil down to a single value.
00:41:33
Speaker
Yeah. It also means you can do things like upserts and stuff like that, which some people do. It's a little weird, because ultimately that has to be handled by the compactor. So if you're changing or updating data or whatever, it does mean the compactor is going to have to do some work on the back end to reorganize it for you. So it all depends. Some workloads are more expensive than others based on the work you have to do.
00:42:05
Speaker
Yeah, yeah. Does that mean you're capturing two timestamps, the kind of logical one that the user provides and the timestamp it actually arrived, to find out which is the newest?
00:42:16
Speaker
Or do you just say whatever's newest in the write-ahead log wins? Whatever's newest in the write-ahead log. And then the actual TSM files that get written to disk are ordered, right? Okay.
00:42:28
Speaker
So we don't use timestamps for those. We actually just use sequence numbers, because clocks are liars and can't be trusted. Yeah. Okay, you've preempted my next question. So, yeah.
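A minimal Go sketch of that read-time merge: the series plus timestamp acts as the primary key, and on a collision the later write, by sequence number, wins. Invented types, not the actual implementation.

```go
// Read-time dedup: collapse points with equal timestamps, letting the
// later write (higher sequence number, since clocks can't be trusted) win.
package sketch

type versioned struct {
	ts  int64   // user-provided timestamp: part of the primary key
	seq uint64  // write order assigned by the engine
	val float64
}

// dedupe assumes the input is sorted by (ts, seq) ascending, as a merged
// stream from multiple TSM files would be.
func dedupe(in []versioned) []versioned {
	var out []versioned
	for _, p := range in {
		if n := len(out); n > 0 && out[n-1].ts == p.ts {
			out[n-1] = p // retried or overwritten point: last write wins
			continue
		}
		out = append(out, p)
	}
	return out
}
```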
00:42:43
Speaker
Okay, I start to see the shape of this.

Transition from Go to Rust in Database Development

00:42:46
Speaker
And I think so. So you've gone through three versions, and you've been deliberately leaning on versions one and two.
00:42:55
Speaker
How has it evolved from that logical model? Yeah, so version one was the time-structured merge tree and then TSI. Version two, we used the same core database technology, but then we added our own scripting and query language called Flux.
00:43:14
Speaker
We added a bunch of stuff around the edges of the database. We launched a multi-tenant, usage-based SaaS service, which I can tell you all sorts of nightmares about. Doing multi-tenancy in databases, I can tell you about the right way to do it and the wrong way to do it. Sadly, I've done it the wrong way before. So I think we'll get into that.
00:43:37
Speaker
And then for version three, because I'm a sadist addicted to pain, I decided it would be a good idea to rewrite the database from scratch, from Go into Rust.
00:43:52
Speaker
And this is the version... we're selling a version of this as a distributed database. We have in beta now a new open source release of version 3, and another commercial product based on that. But there were reasons that we decided to do the rewrite. So first, tell me when you picked Go in the first place.
00:44:21
Speaker
So the company started out as a different company called Airplane, like E-R-R-P-L-A-N-E. We just thought it was funny.
00:44:32
Speaker
And the idea for that product was essentially a real-time metrics and server monitoring product, SaaS, in the same vein as Datadog and stuff like that. But this was in 2012.
00:44:44
Speaker
And to build that product, we actually had to build a time series database. And initially what I used was Scala on top of Redis and Cassandra. So Cassandra is the actual long-term data store, and Redis is the real-time indexing layer.
00:45:01
Speaker
okay And in, like, December 2012, I was like, well... that infrastructure is kind of a pain in the ass to deal with and manage. It doesn't have a bunch of features that I want it to have. And also, I had this idea that we would want to be able to offer our software as an on-premise solution.
00:45:21
Speaker
And there was no way, in 2012, we were going to be able to ship, you know, our Rails app, which is our front end, plus Scala, Cassandra, Redis, all this other stuff. So I was like, I've got to figure out how to get this down to, like, one thing and a MySQL database, right?
00:45:38
Speaker
Yeah, that doesn't scream "package me up as a single binary." No, no. Although now there are people selling the dream that you could do that inside of Kubernetes and whatever. But that is a separate nightmare that I can talk about.
00:45:51
Speaker
But I had this idea where I was like, okay, well, if I can take the time series portion of that and package it up as one thing, then we'd be able to, in theory, ship this as an on-premise service. Because then we're just talking about a Rails app, this one thing, and MySQL.
00:46:09
Speaker
Fine, that would be possible. So... Go had gotten to version 1.0 in March of 2012, and I thought the language was interesting. So I prototyped that backend in Go using LevelDB. LevelDB, like I said, is a log-structured merge tree. It's written in C++. It came out of Google, written, I think, primarily at that time by Jeff Dean and Sanjay Ghemawat.
00:46:36
Speaker
And I just prototyped something, and I found that I was able to get really great performance with very little effort, right? Just on my own, doing a couple of weeks of hacky work. So we rewrote the back end for Airplane in that, and launched it into production in early 2013, at the same time we were going through Y Combinator.
00:47:03
Speaker
Went through 2013, I raised a seed round of funding for this company, for Airplane, and realized by the fall of 2013 that this just was not going to take off. We were not going to be successful with this idea, but I thought the infrastructure that we had built was super interesting.
00:47:22
Speaker
I was like, well... We actually had, on our Y Combinator application... you know, they have the question, which is, what would you work on if you weren't doing this idea that you're submitting with?
00:47:33
Speaker
And what we put down was: open source time series database. Because I was absolutely thinking of it at that time, because in 2010, I had worked at a fintech startup, and I had built a time series solution for financial market data.
00:47:47
Speaker
And then when we did Airplane, we had to build basically the exact same solution, but this time for server monitoring data. And I was like, okay, there's a common abstraction here. I think it's time series.
00:47:59
Speaker
And I also said open source because, you know, it's 2012, 2013. And I'm like, if we're going to build a piece of infrastructure software, there has to be some core part of it that's open source. Otherwise, nobody's going to care. Nobody's going to do anything with it.
00:48:13
Speaker
And this was before it was accepted wisdom that all database workloads were just going to be SaaS in the cloud, managed by somebody else, right? So in the fall of 2013, I was like, okay, maybe we should try this open source thing. Maybe we should try this time series database thing. And we're like, we can't just take the bits that we had written for Airplane and just open source it. That's not going to work. We had never designed it with that in mind. But all the ideas are good, so we can do that, plus we can do a couple of other things, right? That thing had a REST API on it,
00:48:50
Speaker
and I was like, okay, well, it'd be better if it had a query language. And if that query language looked kind of like SQL, people would think it was like friendly and approachable and whatever.
00:49:04
Speaker
So that was the first change that we had decided on. And then the next point of debate was, you know, this is September of 2013. And essentially, at that point, Rust was not far enough along. It wasn't even really on my radar at that stage.
00:49:20
Speaker
You know, it was another couple of years before Rust would even hit 1.0. And async stuff didn't really land for real until, like, fall of 2019. So basically, the debate at that stage was, do we write it in C++
00:49:38
Speaker
or Go? And all of us, the three of us working on it at the time, had the most familiarity with Go. We had all done C or C++ to some extent, but never a ton of it professionally. At that stage, I think I'd written maybe 20 or 30,000 lines of C and C++. Not that much. Okay.
00:50:00
Speaker
And we were just like, okay, well, if we do it in Go, what's the biggest risk? Our biggest risk at that time that we were worried about was the garbage collector, right? We're going to write a database project, and garbage collection, you know, seems very bad. But I was like, well, people have written some successful databases in Java, and Java has a garbage collector. So if they can figure it out in Java, certainly we can figure it out in Go. And the bet we made was that we thought we could move faster if we did it in Go.
00:50:32
Speaker
And Go seemed, in 2013, like it had some real momentum. It was really going to start picking up steam. And we just thought, you know, it's probably going to get better over time.
00:50:44
Speaker
And we'll be able to just piggyback on those improvements. So let's just do it in Go. Because our biggest risk right now is, we need to make a piece of software that anybody in the world cares about.
00:50:55
Speaker
So, you know, the risk is we don't ship anything that anybody cares about. It's not that it's not performant enough or whatever. Although performance is very much the reason why people would care, we already had enough evidence that it would work well enough. So we were just like,
00:51:11
Speaker
we're going to do it in Go. And I would say for those first, you know, three, four years of Influx as a project, I think we really benefited from the fact that it was in Go, right? We were able to speak at some conferences and get some visibility. And we were also able to hire in people who wanted to work in Go. And we could hire in some great developers who were really excited about being developers, which are the people you want to work with, I think, as a developer.
00:51:37
Speaker
And so that was that. But... in 2020... Yes. Yeah, so fast forward. When did the dream die?
00:51:49
Speaker
Yeah, well, it's not that the dream died. Go is still a fantastic language. And there are still things I love about Go that I think, hands down, beat Rust, right?
00:52:01
Speaker
Learnability and compile times. It's not even a debate, okay? I don't think any reasonable developer can disagree that those are better in Go. But maybe... I'm sure on the internet you can always find someone to disagree. That's true, somebody would take the other side of that argument, I'm sure. But, you know... in 2020... Well, I first started getting interested in Rust in 2018.
00:52:29
Speaker
And I took the time to learn it in the fall of 2018. And I picked this project of building your own programming language, like a parser and all this other stuff, as a fun project to learn it. And I wrote this blog post about Rust. And at the time, I thought, this language is super interesting. It has all these properties that make it great.
00:52:51
Speaker
I did think it was an incredible pain in the ass to learn. It had been years since I had been that frustrated as a programmer trying to learn something. It took me multiple attempts, a long, long time ago, to really learn how to write code. I think I had probably made two failed attempts before I actually really started to get it. But you know, those two failed attempts were, like...
00:53:12
Speaker
But you know those two failed attempts were like... in high school or like middle school. Right. Yeah. well When I was trying to learn C++ plus plus off of this, like learn C++ plus plus in 21 days book or something like that.
00:53:25
Speaker
Yeah.
00:53:27
Speaker
You do get this thing though, don't you? Because a lot of programming languages are roughly the same. And so when you pick up a new programming language, it comes very easily. There's not a wild amount of difference between learning Java and learning Python, I would say.
00:53:43
Speaker
So you can pick it up, and then you come to a language that is quite different, and the hill suddenly hits you in the face. Yeah, I mean, to me, at that stage, I felt like, okay, there were probably three different learning curves for programming languages before I'd experienced Rust.
00:54:01
Speaker
right And those learning curves, I think, are basically: static typing and object orientation, right? So Java, C#, whatever. Dynamic languages and duck-typed languages, like Python and Ruby and JavaScript. And then functional languages. For me, my only functional exposure was essentially Scheme,
00:54:19
Speaker
which I just learned as a nerdy exercise, not to actually do anything real. So most of my exposure at that stage was either C# or Java or Ruby or JavaScript, right? And then obviously Go, which I just view as another statically typed language that's actually simpler to learn than Java or C#.
00:54:44
Speaker
Yeah, yeah. Okay, so you get to the point where you're over the learning hurdle of Rust and you're liking it. What happens next? So then 2019 comes. And essentially, at this stage, business-wise... the fall of 2019 is when we launched
00:55:05
Speaker
the first part of version two. But our business at that stage was really all version one of InfluxDB. And like I said, in version two, from a database perspective, the core elements, the underlying implementation of the database, were the exact same. We used all that same code.
00:55:25
Speaker
So there were problems related to the database that people wanted us to solve. In 2019, we launched version two with this new language, and we start getting a bunch of feedback on it. And a lot of the feedback is very negative about the new language, right?
00:55:45
Speaker
People don't want to learn our new language. They think it's dumb. They're like, we want to use either SQL or InfluxQL, which looks kind of like SQL, for querying.

Handling High Cardinality Data Efficiently

00:55:56
Speaker
And we want to use whatever programming language we're going to use for our programming stuff.
00:56:00
Speaker
Which is not an unreasonable thing. Now, I'll set the language thing to the side for a second. The other important things were
00:56:13
Speaker
really core database performance properties. The biggest problem was what I call high-cardinality data. Essentially, as I mentioned, you have the tags, which are metadata that describe your time series.
00:56:28
Speaker
So cardinality basically refers to the number of underlying time series you have in the database: how many individual, unique time series do you have stored in the database? And like I said,
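To make that concrete, here's a minimal sketch (the tag names and values are hypothetical, not from InfluxDB itself) of what cardinality counts: the number of distinct measurement-plus-tag-set combinations, regardless of how many points each one receives.

```rust
use std::collections::HashSet;

fn main() {
    // Each point carries a measurement plus tag key/value pairs.
    // A "series" is the unique combination of measurement + tag set.
    let points = [
        ("cpu", vec![("host", "a"), ("region", "us-east")]),
        ("cpu", vec![("host", "a"), ("region", "us-east")]), // same series again
        ("cpu", vec![("host", "b"), ("region", "us-east")]),
        ("mem", vec![("host", "a"), ("region", "us-east")]),
    ];

    // Cardinality = number of distinct series keys the index has to track.
    let mut series: HashSet<String> = HashSet::new();
    for (measurement, tags) in &points {
        let mut key = measurement.to_string();
        for (k, v) in tags {
            key.push_str(&format!(",{k}={v}"));
        }
        series.insert(key);
    }
    println!("points written: {}, series cardinality: {}", points.len(), series.len()); // 4 points, 3 series
}
```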
00:56:40
Speaker
in some database formulations, you actually don't store those time series ahead of time, right? You produce the time series on the fly from the raw underlying data.
00:56:51
Speaker
You could take just a raw log file and process it into time series, right? And you could actually build a query language that just does that. Loki's query language does that: basically, you can construct time series on the fly.
00:57:06
Speaker
In our case, all the underlying time series mapped to actual, unique structures in the underlying storage. So as the number of time series exploded, everything became more expensive. Reorganizing the data in TSM files became more expensive.
00:57:22
Speaker
And the bigger problem was that the index became super expensive. Where you have tag key-value pairs, the number of time series expands over time, as does the number of unique tag values you need to track in your index.
00:57:39
Speaker
And as that gets bigger and bigger, it gets more expensive. And if you have data where there are a lot of unique things, then your index ends up becoming even larger than your raw time series data, right? So a great,
00:57:57
Speaker
you know, degenerate use case is tracing data, right? With tracing data, you're frequently pulling time series out of it. But if you were to keep it in a strict time series format, you'd have the trace ID as a tag and the span ID as a tag. And what that means is, for every row you write into the database, you're writing
00:58:23
Speaker
a unique span ID at the very least, a unique trace ID and span ID combination, right? Because that represents one point in time, and it represents a time series with only one data point in it.
00:58:36
Speaker
And then, obviously, when you query, you're usually not querying for a specific span; you're querying for give me this trace, or give me some aggregation of the tracing data across all of it.
00:58:47
Speaker
Doing that with the tag structure, you literally couldn't do it. Yeah, I can see the index exploding on that. And there are other use cases where it's not as extreme, but it does become difficult. Like, say, a process ID or a container ID: you spin these up, they're going to run for some period of time. It could be minutes. It could be hours or days.
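As a toy illustration of that degenerate case (the IDs are fabricated): if every ingested span carries its span ID as a tag, every row mints a brand-new series key, so the index grows as fast as the raw data itself.

```rust
use std::collections::HashSet;

fn main() {
    let mut series: HashSet<String> = HashSet::new();
    // Pretend every ingested span becomes a row with trace_id/span_id as tags.
    // Each row creates a never-before-seen series key.
    for span_id in 0..100_000u64 {
        let trace_id = span_id / 10; // ten spans per trace, for illustration
        series.insert(format!("spans,trace_id={trace_id},span_id={span_id}"));
    }
    // The "index" is now exactly as large as the data it indexes.
    println!("rows written: 100000, series created: {}", series.len());
}
```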
00:59:10
Speaker
And that is actually a legitimate time series. The problem is that the number of those expands over time, and it just becomes this super expensive thing. So we had to solve: how do we store data that we can yield time series from, where the underlying data could have effectively unbounded cardinality?
00:59:30
Speaker
Yeah. Right? That's something we wanted to solve, because with InfluxDB, my vision is not that we're creating a metrics database. Metrics is interesting, but to me that's a subset of what's interesting about time series data. Ultimately, we want a database where you can store literally any kind of observational data about the real or virtual worlds that we create, in servers and software and whatever,
00:59:59
Speaker
real-world sensors, machines, anything out there in our physical reality. You can store all of this kind of data, and you can execute queries against it and get time series back. That's where I want to get to.
01:00:14
Speaker
Right. Having this underlying structure meant there was no way it was going to work for all those varied use cases. And we had customers and users telling us: we need support for infinite cardinality. Now, it's not everybody, but it's a very large, vocal group of people.
01:00:32
Speaker
And then one of the other problems we had to solve for was: how do we keep historical data and not have it be super expensive? InfluxDB 1 and 2 required you to have a locally attached disk.
01:00:44
Speaker
We told everybody the best thing to have is a locally attached SSD. And the problem is, like I said, most of your query workload is against very recent data, and you want to be able to query the historical data for ad hoc analysis or whatever.
01:01:03
Speaker
But that's a tiny fraction of your query workload. And having your massive amounts of historical data stored on a locally attached SSD is super expensive. So people wanted object storage.
01:01:16
Speaker
They want to be able to have historical data in object store and all this other stuff. I can totally see that you want to offload those seven-day chunks you were talking about to S3, right? Exactly, exactly. But still have them be accessible.
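The shape of that tiering policy might look something like this sketch (the names and the seven-day cutoff are illustrative, not InfluxDB's actual logic): recent chunks stay on local SSD for the hot query path, and older chunks move to object storage, where they stay accessible but are much cheaper to keep.

```rust
struct Chunk {
    start_time: i64, // nanoseconds since epoch
    path: String,
}

enum Tier {
    LocalSsd,
    ObjectStore,
}

fn place(chunk: &Chunk, now: i64) -> Tier {
    const WEEK_NS: i64 = 7 * 24 * 3600 * 1_000_000_000;
    if now - chunk.start_time < WEEK_NS {
        Tier::LocalSsd // recent data: most of the query workload lands here
    } else {
        Tier::ObjectStore // historical data: rare ad hoc queries, cheap storage
    }
}

fn main() {
    let now: i64 = 1_700_000_000_000_000_000; // fabricated "now"
    let old = Chunk {
        start_time: now - 10 * 24 * 3600 * 1_000_000_000, // ten days old
        path: "chunk-0001".to_string(),
    };
    match place(&old, now) {
        Tier::LocalSsd => println!("{} stays on SSD", old.path),
        Tier::ObjectStore => println!("{} moves to object storage", old.path),
    }
}
```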
01:01:29
Speaker
And then the other big part was that people wanted a query language they knew. So Flux: we still run and operate Flux. We still have customers on version two.
01:01:43
Speaker
We have a partnership with Amazon where they host open source version two, and they now host another commercial version of version two that we launched with them.
01:01:54
Speaker
So we still support Flux, and Flux has a group of people, a minority, but a group that is very enthusiastic about the language and what it can do, right? Because like I said, it's a scripting language that gets executed inside the database. So you can do a lot more things than you can with just a basic declarative query language like SQL.
01:02:17
Speaker
But by and large, the majority of people responding to version two of InfluxDB wanted either InfluxQL, for simple queries where it made sense, or they wanted full SQL.

Integrating SQL and Rewriting Storage Systems

01:02:32
Speaker
They wanted a full SQL engine where they could do joins, windowing, partitioning, and all this other stuff. And when I looked at it: okay, we've got to solve the cardinality problem and we've got to solve the object store problem, which basically means we're rewriting the storage engine.
01:02:47
Speaker
We've got to solve the SQL query problem, which means we're rewriting the query engine. I'm like, if we're rewriting the query engine and the storage engine in the database, that is a rewrite.
01:03:00
Speaker
Yeah. Okay, so totally. I'm not sure I see the... I would be very wary, if I were in that situation, of rewriting both at once in a different language.
01:03:14
Speaker
Yes. Yes, that is true. Yeah. And again, if I were to go back to Paul at the beginning of 2020, I would tell him to do it a different way.
01:03:28
Speaker
Okay, let's start with a decision you did make. I want to know the benefits of hindsight. So you chose to rewrite the whole thing as a big bang in Rust?
01:03:39
Speaker
Yes. Yes. So again, I chose Rust because it was 2020, and the async/await stuff had really landed in the fall of 2019. At that point, I'm like, okay, Rust is going to be good for server-side software.
01:03:54
Speaker
The performance is great: zero-cost abstractions, great error handling, fearless concurrency. And Cargo and the package management system are great.
01:04:06
Speaker
I guess Go solved that problem eventually, but I stopped doing Go in 2016, and at that point it was still very much a nightmare. Okay.
01:04:18
Speaker
So, yeah, I thought Rust was a great language. And then the discussion was: why would I do that versus just doing it in Go? Well, one, not having a garbage collector, and being able to optimize everything we wanted to whatever degree we wanted, was very appealing to me. Those other parts of the language, like the fearless concurrency and the error handling, were also appealing. Had you hit the point where the garbage collector was a genuine problem?
01:04:49
Speaker
There were a bunch of places where it could potentially cause problems, basically. I mean, again, we could have continued working on the Go code base and making improvements.
01:05:05
Speaker
But I thought, well, if we're going to rewrite the entire thing, I want to use what I think is the best tool in 2020 for this job. And I think Rust is a better tool for this. If I were to start a fresh project from scratch right now to build a database or a piece of high-performance server software, I would do that project in Rust.
01:05:29
Speaker
There's no other language I would choose. Okay, okay. And again, it's debatable whether or not the benefits were big enough to warrant making the switch.
01:05:44
Speaker
But I felt there were. The other part of this was that we needed a query engine, and I did not want to write the query engine ourselves. We had done that with version one with InfluxQL. We did it all over again with version two and Flux.
01:05:57
Speaker
And with Flux, we actually wrote not just a query engine, with a planner and optimizer and execution; we also wrote a scripting language. So it was
01:06:08
Speaker
a whole bunch of stuff. And I was like, we're not going to write our own query engine again. We're going to pick something off the shelf. And at that stage, I was fairly certain that the query engine we picked off the shelf would be written in C or C++, because it would come from an existing database, right? Because we needed SQL, so it was going to come from an existing database that supported SQL.
01:06:31
Speaker
And we were just going to rip their query engine out and put it in here. And I thought it was going to be in C or C++. Initially, this didn't start out as: we're going to rewrite the entire database and this is going to be version 3.
01:06:47
Speaker
It started out as: we're going to rewrite the storage system, bring that into the Cloud 2 product, and we're also going to use this query engine on top of it. So one of the first bits of research was: what query engine are we going to use?
01:07:03
Speaker
Andrew Lamb joined us in May of 2020, and one of his first jobs was to look at what was out there and see what we should use, right?
01:07:20
Speaker
And he looked at ClickHouse, DuckDB, and Data Fusion. And in June and July of 2020,
01:07:33
Speaker
there was no ClickHouse company; it was basically just some project that had been open sourced out of Yandex, and it didn't have much traction or visibility. DuckDB: there was no DuckDB Labs, there was no MotherDuck, there was no company at all. It was basically just some postdocs from CWI hacking together this project.
01:07:50
Speaker
Data Fusion had been donated by Andy Grove to the Apache Arrow project in the fall of 2019, and basically that code base was largely just written by Andy and nobody else.
01:08:03
Speaker
So from a maturity perspective, I would say ClickHouse was the thing that was furthest along. But it lacked things that we felt we needed to make the time series use case work
01:08:18
Speaker
well. And basically, when we looked at those three projects, we realized: whatever we pick up, we're going to end up having to contribute to it and own it in a very significant way.
01:08:30
Speaker
It's not going to be something that we can just use and forget about and it just works. And Data Fusion is the only one on that list which is intended to be embedded in other things.
01:08:43
Speaker
Yes. Yeah. So Data Fusion was appealing for three reasons. One, because of what you just said: it was intended to be used as basically a library. It's like a toolkit for building a database, right? It's got the parser and the query engine, planner, optimizer, whatever.
01:09:01
Speaker
So it actually had a structure that lent itself to that. The second was that it was written in Rust, whereas the other two were not; the other two are in C++. And then the last bit is basically governance: it was under Apache.
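In practice, embedding it looks roughly like this sketch (the table name, file path, and query are made up, and the exact API has shifted between DataFusion releases, so treat it as illustrative): you hand the library a data source, and it supplies the SQL parser, planner, optimizer, and execution engine.

```rust
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    // The "database" here is just a library call away: no server to run.
    let ctx = SessionContext::new();

    // Register a Parquet file as a queryable table (path is hypothetical).
    ctx.register_parquet("cpu", "data/cpu.parquet", ParquetReadOptions::default())
        .await?;

    // The embedded engine handles parsing, planning, optimization, execution.
    let df = ctx
        .sql("SELECT host, avg(usage) FROM cpu GROUP BY host")
        .await?;
    df.show().await?;
    Ok(())
}
```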
01:09:18
Speaker
So we knew... well, I didn't know. DuckDB Labs got formed, I think, in August or September of 2020. I didn't know in July of 2020 that that was going to happen.
01:09:35
Speaker
ClickHouse, the company: I don't even know when that started, but it was a while later. It may have been late 2020 or 2021, I don't remember which. But definitely at the time, ClickHouse the company was, at the very least, not a public thing that had happened.
01:09:52
Speaker
So having it be in the Apache Software Foundation was appealing, because the view we had was: we don't want to write our own query engine, but we're obviously going to have to contribute to one very heavily in order to get what we need.
01:10:09
Speaker
But essentially, you can think of the query engine as a dependency that we have. It's a dependency to deliver the ultimate product we want to deliver. And one of the best use cases for open source is to commoditize your dependencies.

Open Source Dependency in Database Development

01:10:25
Speaker
You want your dependencies to be as cheap as possible and as widely used as possible, because the more use they get, the more robust and reliable they become. And there's more chance the feature you're looking for has already been built for someone else.
01:10:39
Speaker
Exactly, exactly. So we're like, the query engine is a dependency for us. It's not actually where we make our money or add our unique benefit to the world. It's just a thing that we need in order to deliver this larger database and service.
01:10:55
Speaker
So having it be Apache, we thought: okay, if we start contributing to this thing and we use it in our project, ideally we can promote that, and it will build up momentum over time, and people will contribute to it.
01:11:10
Speaker
So that's how we landed, basically, with Rust and with Data Fusion. Picking Arrow was obvious, because Data Fusion uses Arrow. We use Arrow internally as the in-memory format for a lot of the data.
01:11:23
Speaker
We also landed on Parquet as the file format for the data. And again, the reason we did that is because we thought: okay, what we actually want is a database where compute is separated from storage, and storage is object store.
01:11:39
Speaker
And we want to keep Parquet files in object store. And the dream we had at that stage was: if there are Parquet files in object storage, we get really easy integration with downstream third-party services. Our focus is this real-time time series database use case, largely against recent data.
01:12:02
Speaker
But we want to make it easy for our customers and our users to access that time series data in large-scale data warehouses and query systems like Athena, Snowflake, Databricks, whatever. We thought, oh, if we have it in Parquet and object storage, people are just going to be able to pull it into those other things.
01:12:24
Speaker
Yeah, the standard will work for you. Yeah. That isn't the reality that we've experienced, but that was the dream. Okay, now we've got to go down that side road quickly. Why does that dream not work out?
01:12:41
Speaker
Because there are all sorts of different pieces of the Parquet format, and not all downstream readers support all the pieces. For us, most specifically, it's nanosecond timestamps.
01:12:57
Speaker
Nanosecond timestamps are supported in Parquet, but downstream people don't really support them. Iceberg, for example: the initial Iceberg spec did not have support for nanosecond timestamps. It had support for microsecond timestamps.
01:13:11
Speaker
We added support to the Iceberg spec for nanosecond timestamps, but that has yet to thread its way through all of the downstream Iceberg implementations and places where you'd want to read Iceberg.
01:13:23
Speaker
So what it means is: the Parquet files that we write right now, in the default format, are not actually readable by all downstream third-party providers. If we want to make them readable by downstream third-party providers, we have to rewrite each Parquet file into a slightly different Parquet file so it can be queried by them. That's the problem. Yes, Parquet is a standard, but it's like SQL is a standard: there are all sorts of different flavors, and
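The heart of that rewrite is just a precision downcast, something like this sketch (not InfluxDB's actual code): convert each nanosecond timestamp to microseconds, accepting the loss of sub-microsecond precision so that readers which only understand microseconds can cope.

```rust
// Downcast a nanosecond timestamp to microseconds. The division silently
// drops sub-microsecond precision, which is exactly the compromise being made.
fn ns_to_us(ts_ns: i64) -> i64 {
    ts_ns.div_euclid(1_000) // floor division keeps pre-1970 timestamps ordered
}

fn main() {
    let ts_ns = 1_700_000_000_123_456_789i64;
    // Prints ...123_456: the trailing 789 nanoseconds are gone.
    println!("ns: {ts_ns}, us: {}", ns_to_us(ts_ns));
}
```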
01:13:54
Speaker
it depends on how you do the integration and where, right? This will change over time, particularly, I think, as more people adopt Iceberg and build around it. But I think that process is going to take years. Yeah.
01:14:08
Speaker
I can see the dream is slowly loading on that front. It is, but the question is: is it actually an achievable dream? Or is it basically just a red herring, where people want the thing to be true, but ultimately it's not true?
01:14:23
Speaker
Like the dream of write once, run anywhere, whatever. Oh, yeah, yeah. The thing that the Java world stopped saying entirely after a few years, right? Right. Yeah.
01:14:35
Speaker
Okay. Let's come back out of that side alley. Because, from the rewrite into Rust looking like a desperately risky thing, I can now see why at the time you were thinking: okay, we'll get a query engine largely for free. We'll get a standardized data format for storage and object storage largely for

InfluxDB Cloud Product Challenges

01:14:56
Speaker
free. This suddenly doesn't seem quite so risky to rewrite entirely in Rust.
01:15:00
Speaker
Yes. But in hindsight, you regret going down that path. Why? Well, so again, the plan, and this was in 2020, was: we're going to rewrite it as the back-end storage system for our Cloud 2 offering. And at this stage in the company's evolution, we very much thought that our Cloud 2 offering was the future of the company. That's where everybody was going to go. Everybody was going to use it.
01:15:25
Speaker
And over the course of the next year and a half, we learned some really important business and technical lessons. Okay, so our Cloud 2 product: there's open source version two of InfluxDB, which is written in Go. It's this all-in-one thing. It basically has a user interface, it has the background processing engine, tasks, and it has the query engine and storage and all this stuff, right? You just install that and you run it.
01:15:57
Speaker
The commercial version of version two is a cloud SaaS product. It's comprised of a bunch of different services, and the whole thing runs inside of Kubernetes.
01:16:08
Speaker
It's a very complicated thing. It uses bits and pieces from that open source version two, but largely it is a completely different database: a completely different implementation, a different architecture. And it serves the same API and the same UI, plus a bunch of additional UI features that only make sense for our cloud offering.
01:16:32
Speaker
So what that means is we were writing two databases: one that went out as open source and free, and one that was a cloud product. And the problem is, the architectures of those two databases were completely different. So the performance and operational characteristics of those two databases are different.
01:16:51
Speaker
And this is kind of unavoidable given the design of the two things. Now, this creates a problem as a business, because everybody who starts, go back to 2020, 2021,
01:17:02
Speaker
2022, any one of those three years: anybody who starts with InfluxDB downloads the open source thing. They use it, they love it, they say, great, I want to go to the cloud-hosted version of it, and they move over. And basically there are some performance and capability differences between the thing they played around with in open source and the thing they're now paying for. That's one.
01:17:24
Speaker
Two, the cloud-hosted version of InfluxDB is usage-based, right? And we really wanted to get to usage-based because we thought, one, we can offer a free tier pretty easily on that.
01:17:41
Speaker
Two, we can charge people. So you pay for data in, in terms of bytes of data in. You pay for data at rest.
01:17:52
Speaker
You pay for the number of queries executed. And you pay for data egress, but that is just price forwarding of what we get charged by the cloud provider.
01:18:04
Speaker
We just charge you whatever the cloud provider charges us for egress. Now, on the query side of things, we actually didn't start with the number of queries. We originally started with compute time: basically, the amount of compute time that the queries you executed used.
01:18:21
Speaker
Makes sense. People hated it. They couldn't understand it. They're like, I have no idea how much this costs. I'm not going to buy this. So we changed it to query count.
01:18:32
Speaker
And basically, when you change something like that... so the truth is, the compute time metric is the fairest metric on which to bill, because that actually represents the amount of our resources that you used, which we ultimately have to pay for.
01:18:48
Speaker
If you switch over to something which is a rough proxy for resources, what you have to do is set a price that's actually more expensive.
01:18:59
Speaker
Because an individual query could take five milliseconds or it could take five seconds. So that delta
01:19:12
Speaker
is huge, right? There's a massive, massive delta in the amount of compute that you use for a query. So if you're just going to bill on query count, you have to make sure you make the queries expensive enough... To absorb the cost of the more expensive ones.
01:19:27
Speaker
Exactly, exactly. So what it means is it's basically a less fair way to price, but people prefer it that way because they find it more predictable. They can easily back into it: I know how many queries per second I'm going to send to the service, and then I can figure all that out. But what that also means is
01:19:48
Speaker
the pricing doesn't look reasonable when you run an open source InfluxDB 2 instance on a cheap VM, and you're like, this thing can do all this stuff for a fraction of the price that your cloud service is advertising.
01:20:04
Speaker
Right? Oh, yeah, yeah, yeah. Oh, that's really thorny from a technical and a business point of view. Yeah. So i mean the truth is, like the cloud service is doing a bunch of other stuff that you know the open source influx to be running on a VM isn't doing.
01:20:19
Speaker
It's providing backups and monitoring. It's highly available and all this other stuff. But at the end of the day, when developers are looking at it and evaluating it, if the price delta is too high, then they're like, this is crazy.
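To see why the flat price has to sit above the average query's cost, here's a toy calculation (every number here is fabricated): the per-query price is set against the expected compute cost across the whole workload distribution, so cheap queries end up subsidizing the expensive tail.

```rust
fn main() {
    // (compute seconds per query, fraction of traffic): made-up workload mix.
    let workload = [(0.005, 0.90), (0.5, 0.09), (5.0, 0.01)];
    let cost_per_compute_second = 0.0001; // dollars, also made up

    // Expected compute cost of a single query across the whole mix.
    let expected_cost: f64 = workload
        .iter()
        .map(|(secs, share)| secs * share * cost_per_compute_second)
        .sum();

    // Flat price must exceed expected cost, or the slow tail eats the margin.
    let price_per_query = expected_cost * 1.5;
    println!("expected cost/query: ${expected_cost:.7}");
    println!("flat price/query:    ${price_per_query:.7}");
    // A 5 ms query costs almost nothing to serve but pays the same flat
    // price as a 5 s query: less fair, but predictable for the buyer.
}
```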
01:20:34
Speaker
And what we found, this is maybe too far into the business weeds, but what we found was that as customers get larger and larger, price predictability matters more than price fairness. As you spend more and more money on a solution, you want predictability. You want to be able to say, I'm doing my budget for 2025 or 2026, and I'm going to allocate X number of dollars to this solution.
01:21:02
Speaker
So you really don't care about usage-based pricing. You care about price predictability. And you don't have the time to dive in and optimize individual queries, which a particular developer might.
01:21:16
Speaker
Right, right. It makes perfect sense. What larger customers preferred from a buying perspective is: just tell me how much infrastructure I'm going to need, and I'll budget for that infrastructure, and we can go from there.
01:21:31
Speaker
They don't want to do usage-based. Now, of course, they deal with usage-based with Amazon and Google and Azure, but that's a totally separate thing. There's a whole cottage industry around optimizing your spend with the cloud providers because of this.
01:21:50
Speaker
So basically, going back to 2020, our view within the company was: Cloud 2 is going to be the future of the product. We are going to build this new storage system and put it into Cloud 2. And the new storage system will bring about those three things, right? Infinite cardinality, cheap object storage, and a SQL engine so that people can do SQL queries.
01:22:15
Speaker
And now we get into the technical lessons here. So we implemented the storage system based on Parquet and object storage and whatever.
01:22:26
Speaker
And we implemented the SQL engine. And within our Cloud2 offering, the storage system is run as this service. And it has this gRPC API that it exposes.
01:22:38
Speaker
And the gRPC API is not a full query API. It's basically a limited API: give me the data for this measurement, for this tag set. And then it will push down aggregates, right? So it has some sort of pushdown.
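The flavor of that limited API is something like this sketch (the types are hypothetical, not InfluxDB's actual gRPC schema): a caller names a measurement, tag predicates, and a time range, and can optionally push an aggregate down so the storage tier returns a reduced result instead of raw points.

```rust
// Aggregates the storage tier knows how to compute itself.
pub enum Aggregate {
    Count,
    Sum,
    Min,
    Max,
}

// A deliberately narrow read request: nothing like full SQL.
pub struct ReadRequest {
    pub measurement: String,
    pub tag_predicates: Vec<(String, String)>, // e.g. ("host", "a")
    pub time_range: (i64, i64),                // nanosecond bounds
    pub aggregate: Option<Aggregate>,          // Some(_) = pushed down to storage
}

fn main() {
    let req = ReadRequest {
        measurement: "cpu".into(),
        tag_predicates: vec![("host".into(), "a".into())],
        time_range: (0, 1_000_000_000),
        aggregate: Some(Aggregate::Max), // storage computes the max, not the caller
    };
    println!("querying {} with pushdown: {}", req.measurement, req.aggregate.is_some());
}
```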
01:22:57
Speaker
But then the actual bulk of the query processing happens in a Flux engine, which runs in RAM as a separate service. So what we did was, at this stage, we're like: we have to support Flux, we have to support InfluxQL.
01:23:13
Speaker
In Cloud 2, those are two separate services written in Go. And what we're going to do is replace the storage service. So everything else in Cloud 2 is still going to be written in Go, Flux is going to be there, InfluxQL will be there, that's all in Go, and they use this storage system.
01:23:32
Speaker
And we implemented that gRPC API, which used Data Fusion to query our underlying thing. And we deployed this, though we didn't deploy it to production for customers. We deployed it in an environment, and we mirrored production workloads from existing customers of our Cloud 2 product. We mirrored over the write workload, the query workload, all that stuff.
01:23:58
Speaker
Makes sense. And what we found was that the performance was absolutely abysmal. It was so terrible. And the reason was: one, people wanted infinite cardinality, but that's not how they were using the product.
01:24:20
Speaker
There was another piece of this that I forgot to mention, which is that our customers had long been asking for the ability to run analytics-style queries against time series data. I mentioned that time series queries are largely: give me the values for this time period for this time series, or these few dozen time series, or compute some sort of aggregate across these 2,000 time series, right? But it's basically all limited to those individual time series.
01:24:51
Speaker
They also wanted to be able to do, and InfluxDB 1 and 2 are terrible at this, a fleet-wide calculation: give me the 10 highest CPU load servers across the entire fleet over the last 30 minutes. In InfluxDB 1 and 2, if you have a large enough fleet, that query just won't return, or it'll blow up the database, or it'll be absolutely terrible.
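For concreteness, that fleet-wide question as SQL might look like this (the table and column names are invented): a scan and group-by across every host, rather than a lookup of a handful of known series, which is exactly the shape the old series-oriented layout struggled to serve.

```rust
fn main() {
    // The kind of analytics query a SQL engine makes straightforward.
    let query = "
        SELECT host, avg(cpu_load) AS load
        FROM cpu
        WHERE time > now() - interval '30 minutes'
        GROUP BY host
        ORDER BY load DESC
        LIMIT 10";
    println!("{query}");
}
```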
01:25:19
Speaker
Right. And again, that's because of how the data is organized and how the query engine works to do the time series thing. Those kinds of analytic queries: more and more of our customers, particularly our larger customers, were saying, we need to do these kinds of queries. We need you to help us do this. So we need infinite cardinality, we need these analytic queries to be fast, and so on.
01:25:39
Speaker
And basically, what we had built with this first part of the storage engine replacement was: infinite cardinality, check. Object storage, check. SQL query engine, check. And analytic queries are super fast, right? When we benchmarked analytic queries versus version one or two, some queries were literally 10,000 times faster. It was completely insane. And we're just like, this is great.
01:26:05
Speaker
Yeah. We do the production mirroring. And guess what? None of the workloads for our current customers have those properties. None of the workloads care about those things. All of the workloads care about the things that version 1 and 2 are actually good at.
01:26:22
Speaker
The things that they were optimized for. And again, I did not appreciate this going in. This is one of the reasons why you don't want to do a full rewrite:
01:26:33
Speaker
it's a classic problem where you solve for the problems you see everybody mentioning, and you forget about the problems that you had already solved.
01:26:44
Speaker
So basically, we created a solution for this set of problems, but we didn't also solve the previous set of them. Yeah, you didn't carry across your history of solving difficult things. Yes.
01:26:55
Speaker
Yeah. Ouch. Ouch. Ouch. Yeah. And I will say, I do think those two workloads are probably at odds with each other in terms of how you execute against them.
01:27:09
Speaker
I'm starting to believe that you actually need to keep two copies of the data, optimized for different styles of workloads, but that's a totally separate thing. So yeah, we learned that lesson. And then, by the summer of 2022, the business lessons of the Cloud 2 product were becoming obvious.
01:27:35
Speaker
And the technical problems were becoming obvious. So basically it was all coalescing into this. And how did you know at this stage? Or did you hit that point and realize: oh, we can't actually ship this?
01:27:51
Speaker
We had not launched at this stage. No, we were still very much in development. And at this stage, the people working on it: so in 2020, it was me and this guy, Ed, who started initially on the work. He was one of the core engineers on version one and then on version two.
01:28:11
Speaker
So he was one of the core database engineers. The two of us started on the work, and then Andrew joined us in May 2020. And then by November of 2020, we had selected the technologies: Rust and Arrow and Data Fusion and Parquet.
01:28:28
Speaker
So I'm going to give this talk in 2020 about how we're building a new core engine for the database: we need Rust developers, right? The idea was we could actually get people interested. And we had, I think...
01:28:43
Speaker
how many was it? Like five people start in early 2021. And we brought over a couple of people from the existing team.
01:28:56
Speaker
So by March of 2021, we had a total of, I think, eight or nine people on the project. Out of a team of, at that stage, I think 70 or 75 developers. Okay.
01:29:10
Speaker
So that's the level of investment and work we were doing. All those other developers were either working on maintaining and improving version one or, most of them actually, working on version two and the Cloud 2 version.
01:29:25
Speaker
So we had eight people working on this thing for multiple years, in isolation.

2022 Restructuring and Cloud2 Operations

01:29:33
Speaker
And then finally, in June 2022, we deployed it into the cloud and started mirroring the workloads, and then realized: it's not working.
01:29:48
Speaker
So the other thing that became apparent in the summer of 2022 was that we had mispriced our Cloud 2 offering, and it was horribly inefficient. There was a period of time where we were actually at negative margins on our Cloud 2 offering.
01:30:10
Speaker
Ouch. Yeah, which is the worst possible thing. And I guess the other technical lesson we learned over this time was that a multi-tenant service is a terrible idea for a data plane.
01:30:26
Speaker
A data plane should be single-tenant, and I'm going to stick to this. Well, it depends on the complexity of the data plane, but the more functionality you add into the data plane, the more important it becomes for it to be single-tenant.
01:30:39
Speaker
Why is that? What's so hard, or so much easier, with single-tenant? Is it just trying to manage different fleets of what should be the same software that isn't?
01:30:50
Speaker
So there's the noisy neighbors problem, as always. But I mentioned the complexity of the service: if you think of S3, particularly what they started with, you had get, put, delete, and list.
01:31:09
Speaker
And there were a lot of guarantees that weren't provided. Just because you did a put on something, if you did a list, it may not even show up, right? There was no strong consistency there. And basically what that meant was,
01:31:23
Speaker
there is a very limited set of things that you can do. Somebody can't submit a query to S3 that causes the system to go crazy and do a bunch of work that you didn't expect, at least with that core API.
01:31:37
Speaker
For the services layered on top, you can get to that, like Athena or whatever. But that's the point: those are separate, and they run in their own separate VMs, apart from the underlying data plane, which is the simplest possible thing you can imagine. And what we had in Cloud 2 was a shared data plane, plus a bunch of query servers, and we didn't separate out who got what.
01:32:04
Speaker
So you had noisy neighbor problems. You had the problem that the largest customers actually want dedicated resources. And they want price predictability, so they want to be able to say: look, I'll just pay you for 50 VMs that you're running for me, as opposed to paying on a usage basis.
01:32:23
Speaker
And we had built the whole service as this everybody-in-everything thing, where a bunch of different workloads all get multiplexed onto the same stuff. Right, yeah. So it's mostly quality-of-service issues?
01:32:35
Speaker
A lot of quality-of-service issues. A lot. Quality of service, and then, like I said, the other piece, which is that large customers want to have their workloads separated from other people's.
01:32:49
Speaker
One, because they don't want their data mixing with other people's; the other reason is also quality of service. And then finally, the idea of reserved instances and reserved pricing.
01:33:02
Speaker
Right, yeah, yeah. Okay, so... go on, go on. Yeah. So basically, in the fall of 2022, we realized we had to restructure the company. We ended up doing a layoff in the fall of 2022, which a lot of people were doing at that time. Yeah.
01:33:23
Speaker
And at that stage we were like, okay: we'd been working on version three for, whatever it was, I guess almost two and a half years, right? We're like, we need to get a version of it out.
01:33:42
Speaker
So we're going to deploy it as part of the Cloud 2 service. And we realized, with the Cloud 2 service, we're going to keep operating that for our customers who are running on it, but we're going to change it up so that it's now going to be based on this new storage system.
01:33:56
Speaker
We're not going to migrate those people over, right? Because we'd already established that the performance is not the same. But we're going to run it as essentially a new service and a new product that people will hopefully sign up for. And then they will have
01:34:09
Speaker
usage that maps to what it's good at. Those analytic queries, and queries against very recent time series data, are still very, very fast and good. It's more the query of: oh, give me the last five days of this data, or go a week into the past and give me 24 hours of this data. But not all the single time series queries are as good,
01:34:34
Speaker
or nearly as fast, in this version of version two, version two cloud, which is still running today, and we're taking signups. That's called Cloud Serverless. It's serverless because, again, it's this multi-tenant thing. So is the current state of play that you've got three different versions out in the wild that you're still maintaining?
01:34:56
Speaker
That's right. Yeah. And we have customers on every single one of them. And that's going to continue to be true for years, for sure. Okay. That sounds painful.
01:35:09
Speaker
It depends. Version one is not really painful. We have a small team of people who are really familiar with the code base. They're able to operate it, and they're able to help with support and stuff like that. The customer base that we have on that
01:35:26
Speaker
is largely settled in on their use cases. And we actually still sell version one Enterprise and version one Cloud to customers where it makes sense, because we're still not at a point where the version of version three that we have is better at everything, right? The vision here is that we get to a version three of InfluxDB that is literally better at everything than version one and version two.
01:35:50
Speaker
We're not there yet. We will get there. But in the meantime, version one and two are actually really good for some things. And we're going to continue to sell those to customers who want to run them.
01:36:03
Speaker
And we're going to continue to operate them, right? If the unit economics work, there's no reason not to continue to run and operate it, and they do. Thankfully, our margins on Cloud 2 are no longer negative.
01:36:15
Speaker
They're positive. So we fixed that problem. But that was the thing: one of the motivations for moving the version of version 3 that we have into our Cloud 2 product as the backend was that it's much more efficient economically. It's much better in terms of cost
01:36:39
Speaker
for the workloads that it services than our version 2 database: the version 2 database that does all this crazy indexing and has to have locally attached EBS volumes with provisioned IOPS and so on. It's basically more and more expensive. So because of that margin problem, we're like, we've got to get this new version out.
01:37:00
Speaker
We've got to just switch over to the new storage engine, even though it doesn't have all the things we want right now. We need to get it launched so that every customer we sign up isn't costing us money.
01:37:12
Speaker
I mean, there were other problems that were basically non-technical, business-related problems, where we were cutting really big discounts that we shouldn't have been doing. All sorts of really, really lame business stuff. Yeah, the realization that when you're building a tech business, you're not just building tech, right? You've got to build the business systems, and they're hard and ambiguous and relatively painful to change.
01:37:38
Speaker
Yeah. So, fast-forwarding a bit: we then launched a version 3 product called Cloud Dedicated
01:37:49
Speaker
in late April of 2023.
01:37:53
Speaker
And what that is, is a single-tenant service. Now, this version of version 3 is a distributed database. It's all written in Rust. It is separated out into a set of services that run ingest, query, compaction.
01:38:10
Speaker
It uses object store. It also uses Postgres as a back-end catalog system. It's a complex, services-based piece of software. But when we launched that, we said we're going to launch this as a dedicated offering.
01:38:27
Speaker
And what that means, because it's such a complicated piece of software, is that the minimum viable footprint of it is actually quite large. So we actually only talk to customers who are going to be spending at least $40,000 or $50,000 annually.
01:38:43
Speaker
And we don't have any sort of self-serve, I-want-to-try-this-out path, right? If you want to try our Cloud Dedicated product, you have to talk to our sales team. You have to go through this proof of concept and all this other stuff. Yeah, that's probably the big downside of single-tenant, right?
01:38:59
Speaker
It's a painful thing. There are ways to do single-tenant where you don't have to do that, but we were not able to do that with this particular release of version three. So, at the beginning of last year,
01:39:14
Speaker
I peeled off from that team to form a new team to work on a much more simplified version of version 3.
01:39:27
Speaker
And the goal was to have a single-server thing that could operate with zero dependencies, or just with object storage. It would be easy to deploy, easy to use;
01:39:39
Speaker
there would be an open source version, and there would be a commercial version. And then this kind of evolved over the course of the last 15 months of development, where I realized: well, if I'm going to use cloud storage as the storage piece, we can actually create a distributed version of this simple version 3 product that is
01:40:06
Speaker
based on what people are now calling a diskless architecture. WarpStream, the company that built a Kafka competitor and then later got acquired by Confluent, did this. The people who started WarpStream had actually done this before in a project at Datadog called Husky.
01:40:22
Speaker
Yeah. So the idea is essentially that you can use object storage as the distributed systems primitive, so you actually don't have to worry about: how do I do replication? How do I do consistency? All this other stuff.
01:40:35
Speaker
You just have object storage, and you can operate multiple servers, and they actually don't have to communicate with each other. They can just share data through object storage. So, the version of version three that I've been working on for the last 15 months with a small team of people: we just launched the alpha of that in January.
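A minimal sketch of that sharing model (the types and key names are made up, with an in-memory map standing in for a real bucket): a writer puts immutable segment files under a shared prefix, and a reader discovers them just by listing that prefix. There is no node-to-node communication anywhere.

```rust
use std::collections::BTreeMap;

// Stand-in for an object store bucket: key -> bytes.
struct ObjectStore {
    objects: BTreeMap<String, Vec<u8>>,
}

impl ObjectStore {
    fn new() -> Self {
        Self { objects: BTreeMap::new() }
    }
    fn put(&mut self, key: &str, data: Vec<u8>) {
        self.objects.insert(key.to_string(), data);
    }
    fn list(&self, prefix: &str) -> Vec<&String> {
        self.objects.keys().filter(|k| k.starts_with(prefix)).collect()
    }
}

fn main() {
    let mut store = ObjectStore::new();
    // Writer node: flushes immutable segments under a shared, sortable prefix.
    store.put("wal/000001.segment", b"points...".to_vec());
    store.put("wal/000002.segment", b"more points...".to_vec());
    // Reader node: no connection to the writer; it just lists the prefix.
    for key in store.list("wal/") {
        println!("reader sees {key}");
    }
}
```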
01:40:54
Speaker
We launched the beta of it about two or three weeks ago, and we'll launch the GA of it later this month. So there's basically open source version 3, which is dual MIT or Apache licensed at the user's choosing, like things in the Rust ecosystem,
01:41:14
Speaker
or this commercial version of it, which offers high availability, scale-out of query workloads, extra security, and all this other stuff. Do you think eventually you'll collapse this back into a single code base?
01:41:32
Speaker
I think so, yeah. And I will say, for the version of version 3 that is currently in beta, the vast majority of the code in it is actually code from that distributed version that we run right now in production.
01:41:47
Speaker
So basically, we took all that code and we're using it; we just created some additional pieces and changes so that it could be operationally much simpler and have this diskless architecture.
01:41:59
Speaker
So it's more of a branch than a fork? Yeah, yeah.

Transition from Management to Coding

01:42:03
Speaker
I mean, eventually these things will all converge onto one thing. There are some other big things that we're working on this year, after the general availability release, that will actually bring the same really good performance capabilities that version 1 and 2 have into this version, along with infinite cardinality, analytics queries against high-cardinality data, object storage, and all that stuff.
01:42:35
Speaker
Yeah. I wasn't expecting us to talk quite as much about the ups and downs of the business side. But that does lead me into another interesting thing about you that I did want to ask. A lot of us, on a career trajectory through tech, get better and better at tech, then are promoted into management, or maybe start our own business. But if you do that, then you really go into management, and you get trapped in being a manager when what you really are, or want to be, is a techie.
01:43:08
Speaker
It seems to me you've done quite a good and deliberate job of avoiding that. Yeah, so it's changed over time. Interestingly, I hired in Evan as the CEO. He joined at the beginning of 2016. I would say,
01:43:27
Speaker
by the middle of 2016, I was writing almost no code. Most of what I was doing was managing: talking to our developers, talking to customers and users, trying to come up with improvements to version one, and then coming up with a vision for version two and ultimately version three.
01:43:51
Speaker
So there was a period, I would say, from mid-2016 to early 2020 where I wrote very little code, and most of what I was doing was
01:44:07
Speaker
managing: not as the manager of a development team directly, but managing the managers and trying to work out: how are we sequencing the work? What are we working on?
01:44:27
Speaker
Talking to customers, talking to prospects, a lot of sales and pre-sales work. Yeah, yeah. And going out to customer sites and meeting with them. And then I would still do a little bit of development in my free time. When I learned Rust in the fall of 2018, I did that in my free time, off work hours.
01:44:51
Speaker
Yeah, I've been there: when you reach the point where you're craving writing some code, just any code, please give me a keyboard again. Yeah. So basically, in the fall of 2020, I decided: you know what, I actually need to pull back and start writing some code. Because I felt that we needed to really focus on making some core database advancements, and all of our focus for the last year or so at that point had been on higher levels of the database, on other things.
01:45:27
Speaker
I was like, well, I'm not going to be able to manage this process of building out this new core of the database if I'm not actually in the weeds doing work on it. So that's when I started getting more and more involved.
01:45:41
Speaker
And then, over the course of that initial development timeline, it started out with me writing a ton of code. Then after a year, we hired in a bunch of developers, and I became more of a manager, reviewing more code.
01:45:55
Speaker
And then gradually over time, I did less and less. Until we finally actually released it. And then we were when we released the first, you know, the cloud dedicated product in April 2023, I switched. I wasn't writing any code at all. Like the only thing I was doing was talking to prospective customers.
01:46:11
Speaker
Right. You keep accidentally slipping up the mountain. Yeah, yeah. So then, again, when it came time
01:46:23
Speaker
to start

Balancing Team Dynamics with Project Growth

01:46:24
Speaker
this new flavor of version 3, I was like: well, the only way I'm going to be able to do this is if I'm actually writing the code. And this time, I deliberately kept the team small.
01:46:36
Speaker
At this stage, we were five people total, and for a while we were just three. I believe that in the early stages of a project, it's better to have a smaller team. I think even five is a bit difficult. Honestly, to me, in the early stages of a project, one or two people is best.
01:46:58
Speaker
Then you add a third and a fourth, right? I think there's a process for adding more people to a project. If you start a project and you have 10 people on it, it's kind of a nightmare when you're just getting started out.
01:47:15
Speaker
So, yeah, basically I've been in the weeds writing a lot of code, and I'm still writing a lot of code. As we move into GA and more of that other stuff, I very much anticipate that over the course of the next 12 months I'm going to be writing less and less code, and doing more speaking at conferences, talking to customers, talking to users.
01:47:38
Speaker
So it all kind of ebbs and flows over time, I think. Yeah, it sounds like you are at heart a builder and a startup maker, and you've found a way to keep starting and spearheading new projects, even within one large company, right?
01:47:58
Speaker
Yes, but hopefully this will be the last one. I don't think I have the energy to do another startup database, so hopefully this is... Yeah, this is the one time series database to rule them all.
01:48:11
Speaker
We'll continue improving this one and keep it going. Cool, cool. Well, I wish you the best of luck with it, and I hope they eventually coalesce on a single glorious Git branch.
01:48:22
Speaker
That is the plan. That is the plan. You'll get there. Paul Dix, thank you very much for joining me. Yeah, thank you so much, Chris. Thank you, Paul. If you want to learn more about InfluxDB or you want to try it out with your own data, you will find links in the show notes.
01:48:37
Speaker
If you're ever tempted to build something similar, you might want to check out our previous episode on Apache Data Fusion, which Paul mentioned. It's something of a database building toolkit, which makes projects like that a lot easier.
01:48:51
Speaker
Still hard, but much easier. Again, link in the show notes. If you've enjoyed this episode, please do take a moment to like it or rate it. It really helps other people to find us.
01:49:02
Speaker
And make sure you're subscribed so that you can find us next time. But until then, I've been your host, Chris Jenkins. This has been Developer Voices with Paul Dix. Thanks for listening.