
Visualize Qualitative Data with Julia Silge

S9 E225 · The PolicyViz Podcast

Julia Silge is a data scientist and software engineer at RStudio PBC, where she works on open source modeling tools. She is an author, an international keynote speaker, and a real-world practitioner focusing on data analysis and machine learning. Julia loves text analysis, making beautiful charts, and communicating about technical topics with diverse audiences.

Episode Notes

https://juliasilge.com/
https://www.tidymodels.org/
https://www.tmwr.org/
https://smltar.com/
https://vetiver.rstudio.com/

Related Episodes

Episode #207: Tom Mock
Episode #201: Leland Wilkinson
Episode #69: Hadley Wickham
Episode #212: Cedric Scherer


New Ways to Support the Show!

With more than 200 guests and eight seasons of episodes, the PolicyViz Podcast is one of the longest-running data visualization podcasts around. You can support the show by downloading and listening, following the work of my guests, and sharing the show with your networks. I’m grateful to everyone who listens and supports the show, and now I’m offering new exciting ways for you to support the show financially. You can check out the special paid version of my newsletter, receive tex

Transcript

Introduction to Julia Silge's Work

00:00:12
Speaker
Welcome back to the PolicyViz podcast. I'm your host, Jon Schwabish. On this week's episode of the show, I chat with Julia Silge. Julia works at RStudio, so if you are familiar with the R programming language, I'm sure you know Julia and her work. She was one of the earlier developers of the tidymodels packages, and she's working on a whole bunch of other things right now. She has two books that have come out over the last year or so, one on tidymodels and one on text analysis.
00:00:41
Speaker
As I mentioned in the first episode of this season, text is going to be a theme that you're going to hear more and more about over the course of the season. So Julia and I talked a little bit, actually not a little bit, a lot, about visualizing qualitative data and the challenges that come with that. You'll want to hear about that. You'll also want to hear about the new package vetiver that Julia and her team have been working on, about machine learning and putting those models out into the world, actually getting them to work in real scenarios and real situations.
00:01:08
Speaker
So I hope you'll enjoy this week's episode of the show. And here's my conversation with Julia.

Julia's Role and Experience at RStudio

00:01:14
Speaker
Hey Julia, welcome to the show. Thank you so much for having me. I'm so happy to be here. I'm so excited to chat with you. We've been emailing for a while. I'm super excited. You have all sorts of stuff going on. In addition to your regular busy schedule, you also have two books out now, which is pretty incredible. I don't know how you do
00:01:37
Speaker
two books and work and lifestyle all at the same time. So I thought maybe you could tell folks a little bit about yourself, and then we'll talk a little bit about each of these new books, but then we'll spend the bulk of our time talking about this new work that you're doing. And I'm not going to mention it now; I'll save it for a little bit. Keep that play button going. What are they going to talk about? What are they going to be talking about?
00:02:01
Speaker
Yeah, so I'm really happy to be here. My name is Julia. My title now is software engineer; I'm a software engineer at RStudio. But when I introduce myself, I often say I'm a data scientist and software engineer, because those are somewhat different kinds of tasks and kinds of skills, and the things I often think about and talk about and write about have that overlap.
00:02:29
Speaker
Yeah, so I've been at RStudio for coming up on three years, two and a half to three years; I've been there for about that long. And it's been no shock, right? It's been a really amazing experience working at RStudio. It was interesting because it is the first time that my title has been software engineer and that the focus of most of my time is on tool building. I think I'm someone who's
00:02:59
Speaker
I've always been very interested in people's really practical workflows. How do they do things? I've been involved in tool building either open source or internally at places I worked for a while, but it is pretty interesting to now be focused on that full-time and less as a daily practitioner, I guess, of using the kind of tools that now I focus on building.
00:03:24
Speaker
And I would guess, I mean, even though you've done open source stuff in the past, I would guess R is kind of different in that there's so much of a community, there's so much openness around

Academic Background and Transition to R

00:03:36
Speaker
it. Like the one thing you build has like 40 other things that add on top of it.
00:03:40
Speaker
Yeah, I actually was not involved in the open source community before R. My academic background is physics and astronomy and my computational background at that point was all C.
00:03:59
Speaker
And I, you know, partly for data analysis pipelines coming out of the telescope, and partly for what we now call embedded kind of work, like writing software to run on the camera on the telescope, writing software for the drivers that, you know, move little optical components around, that kind of thing. So now that's called embedded programming, I guess.
00:04:26
Speaker
Um, but so, you know, I think at the time there must have been open source communities, but I perceived them as scary and not for me, you know, that sort of thing. Yeah. Well, it's like the Linux kernel people; that's what I perceived open source as, because, I mean, I think there's some reality to that, both because of the time that it was then and, um,
00:04:55
Speaker
and what those particular communities are like. I didn't see that as a place where I thought would be a good fit for me and I didn't try and I wasn't interested. That wasn't something that I did before in my pre-data science life.
00:05:15
Speaker
And then, you know, later in life, when I made this career change: first I started learning Python, and then I found out about R and started learning it a bit too. And seeing what the open source communities are like, especially R, I would say especially R, I was like, whoa.
00:05:33
Speaker
Right. This is amazing. I really did, when I was thinking about this transition, think: this would be a good fit for the things that I really like to do, and okay, I'm going to learn these new, or I mean not new, new-to-me-at-the-time
00:05:49
Speaker
programming languages that I had not dabbled in before. I really had this mental attitude of, okay, time to toughen up, time to gird my loins and go back in there, you know, into this kind of technical community.
00:06:05
Speaker
But my experience of it has been just the opposite. Just welcoming, people excited about what you're doing, people interested in what kind of thing you might do. Also interested in, let's call it mentorship, like, oh, I see you built a package, let me offer you resources for
00:06:24
Speaker
how you might get better at that. So actually, no, I really do not have open source history besides R and data science technologies.
00:06:38
Speaker
And I do wanna get to the other stuff, but your origin story is really interesting. So when you were getting into Python and getting into R, sort of learning the open source world and learning the code, is that what drew you to RStudio?

Motivation and Focus on Tool Building

00:06:51
Speaker
Or was that kind of like, yeah, I now see this as a good positive community, I wanna be a part of it, but that's sort of like added bonus and I wanna go to RStudio because of XYZ.
00:07:02
Speaker
And do you mean go to RStudio in terms of full-time employment? Yeah, I guess so. Yeah. Because RStudio has the IDE. I was like, man, I love this. This is great. This is way better than just using Emacs, like I used to do.
00:07:17
Speaker
Coding in Pine. Yeah, right. Yeah, yeah. So in terms of working on open source full time, honestly, I wouldn't say it was a goal I had or something that I had on my radar as an option. I was interested in
00:07:40
Speaker
data science as a practice and I was interested in building tools that made my literal life better, but also could be reused by other people and we can make things better together.
00:07:56
Speaker
Um, and I, I, I, you know, if you had asked me, what do you envision doing? I would say, Oh, well, I'll probably, you know, be a data scientist and, you know, contribute to open source on the side. And you know, like the employers that I had,
00:08:13
Speaker
over those first couple jobs in data science, were definitely employers who knew the value of open source and encouraged involvement in open source software. So, it wasn't a situation where I was like, well, I will get in big trouble if I do anything for an open source project on the clock. It wasn't that kind of situation.
00:08:35
Speaker
But so I was working as a data scientist and kind of thinking about my next step. And RStudio posted this job about working on tidymodels. So what that really involves is like,
00:08:56
Speaker
Not statistical methods, not like let's implement new methods, but rather let's think about how do people go through the process of a model analysis? How can we build software to make a harmonious,
00:09:11
Speaker
less heterogeneous interface to many kinds of models. How can we build in statistical guardrails to how people go about building machine learning models? And I thought, oh, that is right up my alley. I love working on that kind of thing. That's a little bit about process and a little bit about practice. Of course, I am a math and science person by background, but I am much more motivated by
00:09:40
Speaker
tools that are a little bit more around process and practice than about say, let's invent a new statistical method that's going to get
00:09:50
Speaker
this tiny bit better than the one before. That's not super motivating to me. I applied to this job just like any other person did. I wasn't the only person who applied. They interviewed. My process of getting a job at RStudio was pretty much like getting any job. I wasn't in a situation where they said,
00:10:16
Speaker
Julia, come, we want to hire you. It wasn't like that. It wasn't like that. I applied. I interviewed. I was lucky enough, I feel like, to get offered that job. It's been really fantastic. I'm really excited. I talk to people sometimes who I think
00:10:34
Speaker
maybe have an exaggerated idea of what it is like to work on open source or maybe an unrealistic or rose-colored glasses kind of idea. What is it like to work on open source full-time as a job? Because I do think for some people, you're like, that sounds like the dream. The dream. The thing is, it is a job.
00:10:59
Speaker
I mean, don't get me wrong. It's a great job. I enjoy it very much. It's one that I feel very fortunate to have. But it is, it is at the end of the day, a job, you know? Yeah. Yeah. Yeah. I mean, it's interesting to hear you talk about the workflow, because it seems like all the stuff that we're going to talk about
00:11:21
Speaker
is definitely in that vein. And I would guess that the workflow around data and tech and coding has changed dramatically. I mean, even in the three years or so that you've been at RStudio, the way people are working with data has changed.
00:11:36
Speaker
Okay.

Challenges in Data Visualization

00:11:37
Speaker
Yeah. So you've got two books coming out. You've got one book on tidymodels, which you've already talked about. Now, your coauthor Max Kuhn is going to come on the show later this season, so we don't need to dive into all that; he can do the sales job on tidymodels. You've also got this book on text analysis, which is really exciting. And I want to make sure that we get to your current work, so I'm going to ask you just one question on the text analysis book and save the other stuff for another time. So I want to ask you:
00:12:07
Speaker
Do you think data analysis, data visualization is harder with qualitative data or with quantitative data?
00:12:16
Speaker
That's a great question. I think that an important thing to realize about that sort of comparison, that sort of, hey, how do we think about rectangular data versus unstructured data is to realize that when it comes to computers, no matter what you're doing,
00:12:41
Speaker
Python, R, you know, whatever you're doing, for you to be able to summarize, visualize, or eventually train a machine learning model, you have to get that unstructured data, that qualitative data,
00:12:58
Speaker
into some kind of structure, into some kind of shape. I like to think of it that way: if you think about your raw text data or other kinds of qualitative data as, I don't want to say shapeless, but very unstructured in their shape, then if you're going to
00:13:16
Speaker
do some kind of analysis, any kind of analysis. You have to transform that text or other kind of qualitative data into an appropriate data structure for whatever it is you're trying to do.
00:13:31
Speaker
So I've been thinking about this, actually, because the first book that I wrote with David Robinson, Text Mining with R is the name of it, is a book that's really about EDA, exploratory data analysis, but for text.
00:13:50
Speaker
It adopts an opinionated take: having your text data transformed into a tidy data format, where say you have one observation per row, sets you up for success in terms of...
00:14:07
Speaker
in terms of the tasks of exploratory data analysis, whether that's I need to summarize, I need to visualize. So that tidy data structure is one that sets you up for being able to flexibly take a lot of different kinds of tasks or approaches. And it's good for the same reason that just any kind of tidy data is good. Right.
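As an aside, the one-token-per-row idea described here can be sketched with the tidytext package in R; the example text, data, and column names below are illustrative, not from the episode.

```r
# A minimal sketch of the tidy text format: one token per row.
library(dplyr)
library(tidytext)

docs <- tibble(
  document = c(1, 2),
  text = c("tidy text is one token per row",
           "tidy data makes summarizing and visualizing easy")
)

tidy_docs <- docs |>
  unnest_tokens(word, text)   # tokenize: one row per (document, word)

tidy_docs |>
  count(word, sort = TRUE)    # word frequencies, ready for ggplot2
```

From this tidy shape, the usual dplyr and ggplot2 verbs apply directly, which is the "sets you up for success" point being made here.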
00:14:36
Speaker
When it comes to training a machine learning model, we've got to transform, right? But often the best end shape or data structure that we're going to head to is not something that looks like a table, like a table in a database or a tidy data frame, but rather something that looks like a matrix, something you can do math on.
00:15:00
Speaker
basically any of the machine learning algorithms are gonna use some kind of big matrix and do some matrix math or some other kind of approach there. And so there again, we have this transformation that has to happen, but it's a bit of a different one where we need to end up in a different kind of situation. And much like, say the transformation from raw
00:15:27
Speaker
kind of unstructured text to tidy data, that transformation to, you know, something you might think of as a document-term matrix, or just some kind of matrix representation of it. The decisions that you make to get from one to the other really impact
00:15:46
Speaker
what you can learn, in the literal statistical learning sense, and/or the more conceptual sense of what am I trying to do here with this data. That is really the focus of the book that I wrote.
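That transformation from tidy counts to a document-term matrix can be sketched in R with tidytext; the documents and words below are illustrative placeholders.

```r
# Sketch: from tidy one-token-per-row counts to a sparse
# document-term matrix that algorithms can do math on.
library(dplyr)
library(tidytext)

tokens <- tibble(
  document = c("a", "a", "a", "b", "b"),
  word     = c("model", "data", "data", "model", "text")
)

dtm <- tokens |>
  count(document, word) |>
  cast_sparse(document, word, n)   # rows = documents, columns = terms

dim(dtm)   # 2 documents x 3 terms
```

Choices made on the way here (how to tokenize, which words to keep, whether to weight counts) are exactly the feature-engineering decisions discussed next.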
00:16:01
Speaker
It was published last year, yes, with Emil Hvitfeldt. It's called Supervised Machine Learning for Text Analysis in R, bit of a mouthful there. And fully the first third of the book is about feature engineering for text, which is exactly that process of: I have unstructured natural language and I need to transform it into a representation that a machine learning algorithm can, you know,
00:16:30
Speaker
can do math on. Right, right. I had a call yesterday with a team that has qualitative data, and they were sort of asking me for help with some of the pieces. Like, well, if you wanted me to help you with your bar chart or a map, and you sent me a table of data, I could sort of play around with it. But with qualitative data, I felt like,
00:16:53
Speaker
I really need a content expert to go through and tell me what the themes are. And I guess if you can structure that data and pull out the themes using machine learning, you can then sort of share that information with lots of people to say, what's the best way to then represent it?
00:17:10
Speaker
Yeah, I think that's a really interesting question that I've definitely run into in different situations or jobs. That idea of I have unstructured data and what do I do with it?

Handling Unstructured Data Approaches

00:17:22
Speaker
And I think I would say the two big sort of answers to
00:17:26
Speaker
what can I do with this unstructured data? The first big answer, I think, is yes, content experts, domain experts who know something about this; I'm gonna have them label this, I'm gonna have them create themes. And I've worked with really fantastic qualitative researchers who have this skill of, say, I'm going to spend,
00:17:52
Speaker
you know, time doing either top down or bottom up, kind of like categorization of what these things are about. And then once we have those annotations or labels or whatever you want to call that, then we can use that to
00:18:09
Speaker
You know, maybe you can use that as training data. Maybe you can use that in and of itself to be able to learn something. So that's the first sort of answer to what am I going to do with unstructured data? I think the second answer uses quantitative approaches. So uses
00:18:25
Speaker
something like an unsupervised machine learning approach, like topic modeling, or some kind of supervised machine learning approach to be able to predict some kind of label. And then you use these methods of text analysis like
00:18:41
Speaker
tokenization, identifying important words. You basically transform the data, like what I was talking about before, probably some sort of matrix representation, and then use some kind of statistical method to be able to learn from it.
00:19:00
Speaker
In my experience, which of those two options is right to choose depends a lot on how much data you have, right? Like, let's say you have less than a thousand observations; then, yeah, someone is probably going to need to read those and
00:19:21
Speaker
do some principled qualitative analysis on it. But above about a thousand, you can start using some of these quantitative approaches. The thing about text is, you say you have a thousand observations, but the thing you're observing usually is some kind of token, whether that's a single word or something else, and depending on the vocabulary that people use,
00:19:50
Speaker
That can actually mean you don't have a thousand observations; you may have many more observations than that, but you haven't observed each thing very many times. Because of the way natural language works, there are a few words that are used a lot, and then it's a power law, in terms of
00:20:10
Speaker
most of the words are used very few times. And so you actually haven't observed those words very many times. So it depends on the specifics of the vocabulary that people use and how it is used, but truly natural language where people just kind of use all the vocabulary that they have access to, ends up in a kind of interesting situation for, let's just call it the counts, like the counts of how many times you observe different things.
00:20:36
Speaker
Yeah, I'm guessing tokenization is one of those words that's at the tail of that. Yes, yes, yes. I've already said it like three times. Right. It'll be the only time in all the transcripts of this show that it'll occur. So, you know, that's right. Okay, so all of this, I think, segues very nicely into your current work,

Introducing the Vetiver Package

00:20:57
Speaker
which is on the vetiver package, and this is not an area that I'm familiar with, so I am just going to ask you to explain it to me, because I'm guessing people who are listening to this are also not as familiar with it. But it sounds like it's got the combination of all the things you've talked about. If I'm understanding, it's got the workflow piece,
00:21:17
Speaker
it's got the open source piece, obviously, and it's got the text and the machine learning. It seems, in my reading of it, it's working all of those into closing the circle on the workflow.
00:21:28
Speaker
Yeah. Yeah. I like thinking about it that way. This is something that I've been talking about since I was hired, actually, at RStudio. One of the things that we talked about when I was hired was, okay, you're going to come, you're going to work on tidy models packages, but one of the areas that we know
00:21:49
Speaker
For example, we would get questions about it when we would do trainings on tidymodels. It's like, okay, well, once I have my trained model, what do I do with it? If I have some kind of predictive model or
00:22:04
Speaker
some kind of machine learning model, what is it that I do when I'm done? And there is this narrative that R is not good for production, R is not good for
00:22:20
Speaker
real work or work that you need to scale. But it turns out actually that some of those same tasks are difficult in Python as well. The difference is maybe not as big as some people expect or would like to see; the process of taking a Python model and putting it into production is actually also
00:22:48
Speaker
fairly difficult. There are also a lot of questions around what do you do, what are good practices. And so, as I spent more time at RStudio, I expressed interest in working on these kinds of tools. And you're exactly right: the big reason why it appeals to me is that I do love working on really applied
00:23:12
Speaker
problems on really practical sets of tools or sets of analyses that people have. You've trained a model, let's say you use really good statistical practices to do a great job training a model that is
00:23:34
Speaker
reliable and robust and you understand its predictions. But that question of like, what do you do afterwards with that model is where vetiver sits. So vetiver is not about developing a model. Vetiver is about what you might call model ops.
00:23:55
Speaker
or MLOps tasks. So these are tasks like versioning your model. These are tasks like deploying your model. And these are tasks like monitoring your model. So your model is done being trained. And not all models are deployed, right? Like some
00:24:13
Speaker
models are trained for other purposes. In the book with Max, we kind of outline an ontology of models, where models can be built for different purposes. Some models are built to describe data. Some models are built for inferential purposes, so you want to look at, say, the coefficients of the model, and you want to communicate something or explain or understand some situation based on the coefficients of the model.
00:24:41
Speaker
Another big reason why models are trained is to be predictive, to be predictive models. That's typically where it's not very useful to have a predictive model and not be able to deploy it somewhere and put it into, quote, production. People, I think, hear that and either feel scared or confused, because they're like, what does production mean? Yeah, right. Yeah. What does it mean? What does it mean?
00:25:10
Speaker
What does it mean to put a model in production? What does production mean in general? One way I like to think about production is that your model, in this case, is portable. Say you trained it on the computer sitting on your desk, right? And so you have that model on the computer on your desk.
00:25:32
Speaker
But to put it into production is to make it useful in a different computational environment, to make it useful to users other than you. So kind of by that definition, one way you could think about or visualize putting a model into production is, say, building a shiny app
00:25:53
Speaker
that allows people, human beings, to move sliders or change inputs and see what the model predicts, like, if I do this, what does the model predict? And then if you deploy that shiny app somewhere, you put that shiny app somewhere, I would call that putting a model into production. Right.
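The version-then-deploy workflow that vetiver supports can be sketched roughly like this in R; the model, board, pin name, and port below are illustrative placeholders, not from the episode.

```r
# Hedged sketch of versioning and deploying a model with vetiver.
library(vetiver)
library(pins)
library(plumber)

cars_fit <- lm(mpg ~ wt + cyl, data = mtcars)
v <- vetiver_model(cars_fit, "cars-mpg")   # bundle the model with metadata

board <- board_temp(versioned = TRUE)      # any pins board works here
vetiver_pin_write(board, v)                # versioned model storage

pr() |>
  vetiver_api(v) |>                        # adds a POST /predict endpoint
  pr_run(port = 8080)                      # serve the model as a REST API
```

The same pattern works whether the "production" target is a shiny app's backend or another service calling the endpoint.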
00:26:16
Speaker
Most often when people put models in production, what they're doing is they're making their model available, not necessarily to a person to be able to, say, move a slider and see how the prediction changes, but probably to make the predictions of that model available to IT infrastructure. So say, you know, you have
00:26:38
Speaker
say, a business selling widgets and you want to have a model that helps you decide, predict what is the most likely widget that this customer wants.
00:26:54
Speaker
And so you don't necessarily want a person to go and look, but rather you want in a more automated way for whatever system that's facing the customer to be able to say, ooh, hey, this is the one I think that is the best. Or like, here's the highest probability category or whatever that this person is interested in.
00:27:14
Speaker
And so usually when people say put a model into production, what they mean is they take the model, package it up, and make it available so that the systems you have can interact with the model and get out what they need. And if you've ever heard the term microservices, what that means is just like
00:27:34
Speaker
Let's separate those things out so that each one of them is its own little computer, basically. Its own little piece, so that you don't have them all together in one system. The system that shows the customer something is separate from the system that makes the prediction, but they all can talk to each other.
00:27:56
Speaker
Yeah. And the way that most of these things talk to each other, in most situations, is via a RESTful API. So if you've ever said, oh, I'm going to use, say, in R, the httr package, and I'm going to do a GET request or a POST request, you use HTTP calls for one system to be able to talk to the other system. So that's when most people say,
00:28:21
Speaker
put a model in production, that's usually what they mean. They want to have the model somewhere so that other parts of your infrastructure can make an HTTP call and get back the predictions that they need.
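Such a call from R might look like the following with the httr package mentioned above; the endpoint URL and input values are hypothetical.

```r
# Hypothetical request to a deployed model's /predict endpoint.
library(httr)

resp <- POST(
  "http://127.0.0.1:8080/predict",        # placeholder URL for the service
  body = list(list(wt = 3.2, cyl = 6)),   # new observations, sent as JSON
  encode = "json"
)
content(resp)   # parsed predictions returned by the model service
```

Any other part of the infrastructure, in any language, can make the same HTTP call, which is the point of exposing the model this way.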

Model Versioning and Client Feedback

00:28:35
Speaker
The predictions from the first part. And then on the version control, does vetiver have version control embedded within it? Or do you need to also be using a GitHub-type solution to have that piece?
00:28:50
Speaker
So this is an interesting question. How do you version data? How do you version models? So let's talk about just models. Git can be used to version models, but they sometimes get a little big, because they're not like
00:29:07
Speaker
text. They're not plain text. They're usually binary objects. And if you update one, like you retrain it with new data, it's not like text where you're like, ah, I can diff it and I can see that this line has changed. No, the whole binary file has changed. And so Git, um, can be used for that kind of purpose, but
00:29:30
Speaker
It gets big, and sometimes you don't get the reasons why you typically use Git, all the diffing, all the line-by-line, all that kind of thing; it doesn't really apply to a model. So there are some tools that use Git for versioning data and/or models. One that I like is called DVC, Data Version Control.
00:29:52
Speaker
But that's actually not the approach that we took. We took an approach that's a little bit more like: say you have a way to store files somewhere. Let's put a little bit of a layer on top of that so that you can version, so that you have access to all the versions you need from the past, attach some metadata that's appropriate for a model, and switch out backends in kind of a flexible way.
00:30:22
Speaker
So it's actually the same approach that the pins package takes. There's an R and a Python version of pins, and the metaphor here is you have a board, picture a bulletin board, and then you pin things to the board. And then it's like, oh, it's time for me to make a new version of that; you kind of pin a new version on top of the old version. Gotcha. Yeah.
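The bulletin-board metaphor maps directly onto the pins API in R; the folder path and pin name below are illustrative.

```r
# Minimal pins sketch: a board plus versioned pins.
library(pins)

board <- board_folder("pins-demo", versioned = TRUE)
pin_write(board, mtcars, "cars-data")   # pin a new version to the board
pin_versions(board, "cars-data")        # list all stored versions
pin_read(board, "cars-data")            # read back the latest version
```

Swapping `board_folder()` for a cloud-backed board changes where the versions live without changing the rest of the code, which is the flexible-backend idea described next.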
00:30:46
Speaker
And so where you actually keep it is quite flexible. You can write a pin to a cloud platform's blob storage, like S3 or Azure Blob Storage or something like that. If you use RStudio's Pro products, you can use Connect as the storage device, where you come and get these versioned storage pieces.
00:31:13
Speaker
If you're in a situation where, say, you have a shared network drive or something, you can use that. It looks the same to the user, but you are deciding where it goes. It's a little bit of a bespoke versioning approach that is meant to have flexible backends and give you just the right amount of abstraction around it, where it's straightforward
00:31:40
Speaker
to get the right version, it's straightforward to share it with the right sets of users, that type of thing. So do you now get to, or do you and your team try to work with RStudio clients, I guess, to see how they're implementing it and where the blockages are and where they need improvements?
00:32:02
Speaker
Yeah, that's a super valuable piece of feedback for us. We work on open source. I am on the open source team. Me and my team work on open source software, but because it is so much about
00:32:20
Speaker
deployment, which is really about where my data artifacts and work finally come into contact with the infrastructure that I'm working in, the things we build
00:32:36
Speaker
involve, like I've been saying, very applied, very practical questions. How are people really doing this? Very early in the process of all of this, I spent some time doing
00:32:51
Speaker
a round of user interviews. I had a little questionnaire I had put together to try to understand: what were people doing now? What was working for them? What was not working for them? Where did they see their pain points? Where was collaboration not working? Yes, kind of falling apart. Yeah, yeah.
00:33:12
Speaker
And I so appreciate all the people who are willing to talk to me. So that was a mix of people who are RStudio customers and people who are from the community who had been trying to, say, put a model into production. And I really appreciate those people's time and expertise and perspectives, because it did inform what we eventually decided to build, which
00:33:38
Speaker
It's different from some of the other options out there in that it's really flexible about what you bring. What kind of model you bring is where we put the hooks, where you enter into this process. We built that to be very flexible. Hopefully, we have the options for, say, the 80% case, and then it's extensible and customizable for, say,
00:34:07
Speaker
And I mean that in a real, substantive way, for multiple teams doing multiple different things. Yeah. Like you mentioned earlier, sort of having this piece and this piece and this piece. That's not just tools, right? That's people working in their own little pieces. And trying to resolve that, it's a good old industrial organization challenge. Yeah.
00:34:28
Speaker
No, for sure. That was all early on. Since we've had something to show people, we do pretty regular demos and feedback sessions where we show people a demo and ask: what questions do you have about it? What do you think would work about this? What
00:34:50
Speaker
are you still looking for? Not that we can be everything to everybody, because we can't, but we want to understand what the most common use cases or difficulties are. Right. So, one last question for you: are you making Vetiverse t-shirts? Because that's the obvious
00:35:13
Speaker
extension of it.

The Metaphor of 'Vetiver'

00:35:17
Speaker
It's a real word, vetiver. It's an ingredient in perfumery. I'm into perfume and I'm into fancy candles. If any of your listeners are thinking, "vetiver, I feel like I've seen or heard that somewhere," it's probably from a fancy candle or something like that.
00:35:40
Speaker
It's a stabilizing ingredient in perfume. It does smell good on its own, but its main purpose is that if you take more volatile
00:35:54
Speaker
scents and fragrances, vetiver will help stabilize them so they don't just evaporate away. And so the metaphor here is that vetiver helps you stabilize your models. Your models are these more volatile components, and vetiver stabilizes them so that you have the version you need, it is reliably deployed in a way that lets you get to the predictions, and you can monitor how it's doing over time.
00:36:23
Speaker
Gotcha. So in doing research on this before deciding for sure on the name, it was kind of appealing that it was a bit of an unusual word, right? There's not a ton out there. I think there is something out there called the Vetiverse, because there's a band or something, but it's a pretty uncommon word. Yeah.
00:36:44
Speaker
Okay, but if I'm ever out and I see a perfume store or a candle store named Vetiverse, I'm going to expect you to be in there selling your custom-made candles and stuff. Yeah, that's right. With RStudio stickers on the bottom next to the price or something. Perfect. Perfect. I love that.
00:37:02
Speaker
Julia, thanks so much. This was really interesting. I just learned a lot, I'll say that. Really fascinating stuff. Congrats on getting it out. I'm excited to see how it gets picked up and finds its way into the open source community, and to see what folks do with it. It'll be really interesting to watch. Yeah, we're super excited about that too. Great. Well, thanks again for coming on the show. It's been great chatting. Thank you so much for having me.
00:37:28
Speaker
Thanks, everyone, for tuning in to this week's episode of the show. I hope you enjoyed that. I hope you'll check out all of Julia's work on her website, at RStudio, and in the R package ecosystem. I put links to all of her packages that we talked about and her books in the show notes, so I hope you'll check those out. And I hope you'll check out policyviz.com, where you can learn more about visualizing and communicating your data. So until next time, this has been the PolicyViz podcast. Thanks so much for listening.
00:37:54
Speaker
A whole team helps bring you the PolicyViz podcast. Intro and outro music is provided by the NRIs, a band based here in Northern Virginia. Audio editing is provided by Ken Skaggs. Design and promotion is created with assistance from Sharon Satsuki-Ramirez, and each episode is transcribed by Jenny Transcription Services. If you'd like to help support the podcast, please share and review it on iTunes, Stitcher, Spotify, YouTube, or wherever you get your podcasts.
00:38:17
Speaker
The PolicyViz podcast is ad-free and supported by listeners, but if you would like to help support the show financially, please visit our Winnow app, PayPal page, or Patreon page, all linked and available at policyviz.com.