
/ML basics: learning by doing

The Forward Slash Podcast

Curious how AI can turn messy, unstructured data into something actually useful? This episode, we dive into a real-world project that explores just that. Senior developer Azriel Alvarado shares how building a song recommendation tool based on lyrics led to deeper insights into embeddings, vector search, and working with natural language data. Whether you're in search, content tagging, knowledge management, or building smarter user interfaces—this episode is packed with takeaways for applying AI in meaningful ways.

Transcript

AI vs Traditional Programming

00:00:02
Speaker
It reminded me of something that I read a while back when I was getting into AI. AI or machine learning style programming departs from traditional programming in that, for traditional programming,
00:00:19
Speaker
you define the rules and then you provide input, and the machine produces output, right? But with this kind of stuff, you give it examples, like input and expected output, and then the machine will figure out the rules for you.
00:00:39
Speaker
Sometimes there are a lot of rules that we don't fully understand, or it might just not be feasible for us to figure them out, right? And so there's this type of problem which I feel AI just solves best, and it might just be the only way to solve a lot of these problems.

Introduction to Forward Slash Podcast

00:01:02
Speaker
Welcome to the Forward Slash Podcast, where we lean into the future of IT by inviting fellow thought leaders, innovators, and problem solvers to slash through its complexity. Today we have with us Azriel Alvarado. Did I say that right, Azriel?
00:01:14
Speaker
Yeah, that's right. All right. Azriel is a senior software developer, Android and Kotlin enthusiast, master tinkerer, and really good at not reading documentation and learning things the hard way.
00:01:28
Speaker
Wow, we must be related.
00:01:33
Speaker
I think I'm the same way. I don't like reading documentation. I like tinkering myself. And then when I do read the documentation, it's like, oh man, I wish I'd read that a month ago and not gone down that whole path.
00:01:44
Speaker
Oh yeah. The saying is you can save six hours of debugging by reading documentation for five minutes. Something like that. Yeah.
00:01:56
Speaker
Nah. Well, welcome to the podcast, Azriel. All right.

AI Project: Song Matching Tool

00:02:01
Speaker
So what we're going to talk about today, I know you had a somewhat recent experience. You know, this is, again, our May. We're doing a lot of stuff with AI this month. We're calling it our May-I initiative.
00:02:17
Speaker
I wanted to talk about, I know you were one of the very early people to kind of just dive in with AI in general here at Caliberty. And, like you said, you're a tinkerer. You were tinkering on your own, just building something. You found a problem, you wanted to solve it, and you're like, let me do it.
00:02:35
Speaker
So tell us a little bit about, what was the problem at hand that you were trying to solve with the solution you built? So my girlfriend is really good at identifying songs
00:02:49
Speaker
which fit moments. So I'll be doing whatever, and she'll be like, oh my God, this song is just like this moment. Like, oh wow. And so I look at the lyrics and it's spot on.
00:03:01
Speaker
And so I thought it was pretty cool. And there came a time where I wanted to do that myself as well. I was at the time also interested in AI. So I was like, maybe I can use AI for this. Because I found myself asking
00:03:21
Speaker
ChatGPT sometimes, like, what's a good song for this situation? Or, thematically similar songs to this or that or the other. And so I decided to write my own tool for doing that, but with my own playlist songs. Because ChatGPT just gave me really generic songs.
00:03:42
Speaker
And so I wanted to have something that would do the same thing, but with my playlists. So... I also really wanted to write a command line interface tool. I've been wanting to write one of those for a long time, because I just love the way that the UI looks in the command line. So I built the command line tool, which will allow you to load your songs from a playlist on Spotify.
00:04:04
Speaker
It'll also do some web scraping to pull the lyrics for the songs. And then it will use an encoder, which is part of a large language model, which is used to quantify the meaning in text.
00:04:25
Speaker
Okay, so I see you're trying to find songs. What kind of tools did you get yourself into when you were starting to build this solution? Like, what did that even look like? What were you digging into? What were you reaching for as far as tooling?

Using Google's Universal Sentence Encoder

00:04:42
Speaker
So in my case, I used one called Google's Universal Sentence Encoder. So basically you feed that thing some text, which can be a sentence or part of the lyrics or whatever, and it'll give you, I think it's a 512-dimensional vector, which represents the meaning of whatever text you gave it.
00:05:04
Speaker
And so it'll encode that and store it in a vector database, which is basically Postgres with the pgvector extension. It would load up my songs, load the lyrics, the quantified meaning of the lyrics as embeddings, and then I would feed it user input. I'd be like, I don't know, some astronaut is lonely in space or whatever.
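
A minimal sketch of that encode-and-store step, assuming Python with the published Universal Sentence Encoder on TensorFlow Hub and the pgvector Python adapter; the table and column names here are illustrative, not taken from the actual project:

```python
# Sketch: encode lyrics with the Universal Sentence Encoder and store the
# 512-dimensional embeddings in Postgres + pgvector. Assumes a reachable
# database with the pgvector extension and a songs(title, embedding) table
# (illustrative names; see the table definition sketched later).
import tensorflow_hub as hub
import psycopg2
from pgvector.psycopg2 import register_vector

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

conn = psycopg2.connect("dbname=songs")
register_vector(conn)  # lets numpy arrays bind to vector columns

def store_song(title: str, lyrics: str) -> None:
    vector = embed([lyrics]).numpy()[0]  # shape (512,)
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO songs (title, embedding) VALUES (%s, %s)",
            (title, vector),
        )
    conn.commit()
```
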
00:05:34
Speaker
It'd be like, you know, Major Tom. And so that was kind of the idea

Challenges in Song-Matching AI

00:05:43
Speaker
behind it. And it actually worked.
00:05:47
Speaker
I got it working, after a lot of pre-processing. Because the lyrics thing is kind of tricky.
00:05:58
Speaker
At first I was like, oh, I'm just going to plug the lyrics into this thing and it's going to give me the meaning, whatever. But it turns out that lyrics can be really weird. So, hold on, I've got a challenge. Do you think your girlfriend will listen to this episode?
00:06:11
Speaker
I'm going to challenge her expertise here a little bit. I'll have her listen to it. Well, I don't mean like I'm better at it. So when you explained that, it's like, have you ever seen the movie City Slickers?
00:06:23
Speaker
City Slickers? It's from a long time ago. There was a scene where these two guys are supposed to be like Ben and Jerry, right? And one of them is Ira or something. I don't remember their names. But the thing was, he was like, oh, yeah,
00:06:34
Speaker
my brother here can name the best ice cream to go with any meal. Just try them. And then they would name a meal and he's like, oh yeah, scoop of chocolate, scoop of vanilla. And they're like, yep, that's right. They're like, wait a minute.
00:06:47
Speaker
Come on. Give me an example of how your girlfriend is good at this. Give me some context here. Is this true, or is she just making it up? Yeah, no, I see what you mean. Because sometimes I have the same doubts with AI, right? It'll just give those kinds of generic answers, right?
00:07:04
Speaker
We have cats, and one of them is really mean. And so I think it was, the cat did something crazy and she was like, oh, that's like this Primus song. You've heard Primus?
00:07:22
Speaker
Of course. It was Tommy the Cat. Tommy the Cat, from Primus. So that one was kind of an easy one. That's the one I remember the most because it was easy. But I promise you there were harder ones.
00:07:33
Speaker
All right. I appreciate it. Just the fact that you would recommend Primus is impressive to me. Okay. All right. She can be an expert. We will accept her testimony in this court as an expert witness. Okay. All right. So we're building a solution. We're trying to replicate your girlfriend's talent of naming songs that are perfect for an occasion. Right. Okay.

Cosine Similarity for Song Matching

00:07:58
Speaker
So you've downloaded your songs, you've downloaded some lyrics, and you've taken basically the text of the lyrics. You're not doing anything with sound processing or anything, right? You're just doing the lyrics. Just lyrics. Yep.
00:08:11
Speaker
Okay. And you turn them into a vector. So, like when we're doing this with documents, we call that an embedding, right? Yeah, an embedding. Right. I know one thing,
00:08:24
Speaker
when I was learning about embeddings, I was shocked. You would think, oh yeah, you need this embedding vector so that you understand all the semantics about this, and what does this text even mean?
00:08:35
Speaker
And you hear, as you said, oh, it's a 512-element vector, or feature vector. Like, how in the world do you get all that context and that richness with only 512 numbers stacked up together? That's just wild that they can do it with so few numbers, actually.
00:08:55
Speaker
Yeah, and that's probably the lower end, I think. I mean, I think it goes way higher, too. I saw one that's like 1,000. But so far I've tried to mostly play around with the smaller models, mostly because it's easier to run them locally.
00:09:15
Speaker
You know, they're faster. Also, because I'm a mobile developer, part of what I'm aiming towards is being able to learn how to use these models and then eventually maybe do some fine-tuning and load them onto an app, right, so they can run locally on a mobile device. I think some of the smaller-end ones are like 250 megabytes, something like that. It's still kind of heavy. Yeah.
00:09:42
Speaker
No, not running that on my iPhone. Well, you're an Android guy, so you wouldn't worry about that.
00:10:02
Speaker
All right. So we've uploaded lyrics. We've created embeddings. We've saved those embeddings into a vector store. And now what are you doing at this point?
00:10:13
Speaker
Right, so now those are just sitting there in the vector database. And so I take user input. So I run the program, right, and I prompt the user for input.
00:10:26
Speaker
The user inputs, you know, whatever song lyrics, like rock music causes riots or whatever, revolt. And then I press enter, and the program is going to take that text and do the same thing to it, so it's going to create an embedding for that text.
00:10:45
Speaker
And then it's going to perform a query on the database which performs this operation called cosine similarity, which basically, you know, it's math, but it basically compares that vector, that embedding, with the embeddings of the songs in the database, and it'll find the ones that are most similar, which should mean that they have similar meanings.
00:11:21
Speaker
So... if the embedding represents the meaning of the text, then whatever vector is closest to the vector of the input text should represent text that is most alike.
00:11:36
Speaker
So it's more similar. So it performs that query. And that happens out of the box. You don't actually have to write the code for cosine or anything fancy. The pgvector extension in Postgres does that for you.
00:11:50
Speaker
You just have to, you know, it gives you a special query command to use. So it does that automatically. And it'll just give you, I don't know, the top five matches.
00:12:02
Speaker
And so that's what I return. I return the top three or something. You can use as many as you want, but I just use three.
00:12:13
Speaker
And so, yeah, in the case of my example, it would return Rock the Casbah. Nice. Good song, too.
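
A sketch of that lookup, continuing the illustrative Python and pgvector setup from above; pgvector exposes a cosine-distance operator, `<=>`, so ordering by it ascending returns the most semantically similar rows:

```python
# Sketch: embed the user's prompt the same way the lyrics were embedded,
# then let pgvector rank songs by cosine distance. Reuses the illustrative
# embed() and conn from the earlier sketch.
def top_matches(query: str, k: int = 3) -> list[str]:
    query_vec = embed([query]).numpy()[0]
    with conn.cursor() as cur:
        # <=> is pgvector's cosine-distance operator; smallest distance
        # means highest cosine similarity, so order ascending.
        cur.execute(
            "SELECT title FROM songs ORDER BY embedding <=> %s LIMIT %s",
            (query_vec, k),
        )
        return [row[0] for row in cur.fetchall()]

print(top_matches("rock music causes riots"))
```
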
00:12:23
Speaker
So your solution, you stopped short of going back to the language model itself and kind of doing the RAG thing, right? Where you search your vector store and get back the lyrics that sound like they should be the match.
00:12:37
Speaker
And then you send it to an LLM and say, tell me some things about this. You're just using the similarity search as kind of a quicker way to get at the information you need.
00:12:48
Speaker
Right, right. I'm not doing any generation, so I'm not feeding it back into the decoder piece of the large language model to have it generate something for me.
00:13:01
Speaker
Just leaving it at that, because I don't need to generate anything. I already know the song names and all that. So it's not so much RAG.

Embeddings for Semantic Search

00:13:08
Speaker
It's more like...
00:13:12
Speaker
Right. You're just R. Retrieval. Arr, like a pirate. You're not pirating music, are you? No. Certainly not. We don't do that anymore. That was a long time ago. Nobody does that.
00:13:23
Speaker
Okay. So how did you find how effective that was? Because I've done that as well, but I had mixed results depending on, you know, different embeddings and whatnot. How was your experience with that? So, also mixed. What I found was, yeah, there's a lot of work that goes into pre-processing, and you can make it easier on a model to do what you want it to do if you do the right pre-processing.
00:13:57
Speaker
So at first I was like, I'm just going to feed it the entire lyrics. And that's not good, because I think you get these watered-down meanings, these embeddings, right? Because they're representing the meaning of the entire song, and an entire song can touch on various subjects, right? So that didn't work out super well. It worked better when I kind of
00:14:25
Speaker
split it into lines and removed a lot of stuff that wasn't a noun or an adjective, or things like that. So, you know, filler stuff. They'll have the yeahs and the oohs and the ahs in there. Get rid of all that.
00:14:43
Speaker
And then just calculate embeddings for each line and use an average of those. So there'll be the average of all those embeddings.
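
A sketch of that per-line preprocessing, assuming spaCy as the NLP library (the episode doesn't name one) and the illustrative embed() from the earlier sketches:

```python
# Sketch: split lyrics into lines, keep only content words, embed each
# cleaned line, and average the line embeddings into one song vector.
import numpy as np
import spacy

nlp = spacy.load("en_core_web_sm")

def song_embedding(lyrics: str) -> np.ndarray:
    lines = [ln.strip() for ln in lyrics.splitlines() if ln.strip()]
    cleaned = []
    for line in lines:
        doc = nlp(line)
        # Keep nouns, adjectives, etc.; drop interjections and filler
        kept = [t.text for t in doc if t.pos_ in {"NOUN", "PROPN", "ADJ", "VERB"}]
        if kept:
            cleaned.append(" ".join(kept))
    vectors = embed(cleaned).numpy()  # one 512-dim vector per line
    return vectors.mean(axis=0)      # average into a single song vector
```
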
00:14:58
Speaker
So that worked a little bit better for me. I got better results. And I also found out a lot through trial and error. Partly because I'm new to this, so I don't understand all of the inner workings, but I felt like overall, fundamentally, it's more of a trial-and-error process than traditional programming.
00:15:22
Speaker
Where I'm like, oh, I'm going to try this, and I kind of have a sense of what that might do for me, and then I just do it, and then I'll get different results, or measure the results and how it performs and all that. But yeah, overall, the performance of it at the end, I was able to make some improvements, but it still kind of struggled to catch on to the abstract meaning in lyrics, which is very poetic and can be highly abstract.
00:15:55
Speaker
So I would sometimes have to... It would perform best when I would actually include words that would be in the lyrics, right? So if there's a song that talks a lot about, I don't know,
00:16:06
Speaker
plants or trees, and I include the word tree in my input, then it'll do way better. So yeah, that's kind of what I found. Yeah, I found the embeddings are pretty good at having generic meaning, so you don't have to take the text and, say, turn "trees" into "tree". It's not doing an exact keyword match or anything, and it does develop some semantic meaning. But I found the same thing as you: the closer you are to the exact words, the better the results I was able to get out that I would say I would expect. So over time it felt more like search, just raw search, as opposed to meaning search.
00:16:57
Speaker
Yeah, it kind of felt like that. It's definitely a bit more powerful than your standard traditional searching tools, but still, I think that without further fine-tuning for the specific case, the kind of input that you're giving it... I mean, I'm not sure how exactly Google's Universal Sentence Encoder was trained, but probably not with a whole lot of poetry.
00:17:24
Speaker
I mean, that's a good point. So it probably has a hard time doing that. It's probably more for, you know, standard sentences that communicate more concrete things.
00:17:38
Speaker
You know, because, like I said, it's a relatively small model. And the context is only like 512
00:17:47
Speaker
dimensions of vector space or whatever. And so I don't expect it to be able to do a whole lot beyond something generic. So yeah, that was my experience with it.
00:18:04
Speaker
I also did some fine-tuning with some other stuff. Because I was like, man, I have to do some fine-tuning on this to make it more effective. But I felt that the problem space for the lyrics thing was too challenging for my level. So I did it with something else, a different problem.
00:18:24
Speaker
So as you were digging into this, you know, you're a tinkerer anyway, so you're rising to the challenge. But as you're digging in, I know it's probably like, okay, where do I even start with this stuff? How did you find that experience of getting quickly to the information you needed to accomplish what you were trying to do? I mean, there's just so much stuff out there right now with generative AI and all that. Did you find that easy to do, or was it hard to weed through everything?

Using Hugging Face and ChatGPT

00:18:57
Speaker
For this part, right, for using the music search, lyric search application, it wasn't that difficult, because most of the work was really standard programming stuff.
00:19:17
Speaker
I mean, really where the AI knowledge came in was in selecting a model, and kind of understanding some minor details of, okay, well, how do I set up the vector database to support what the model is outputting, right? So I had to make sure that the vector database used the proper dimensions. Like, oh, it was 512-size vectors, right? And this model outputs vectors of that size.
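
Concretely, that dimension handshake is part of the table definition; a sketch, with the same illustrative names used in the earlier snippets, of declaring a pgvector column to match a 512-dimensional encoder:

```python
# Sketch: the vector column must be declared with the same dimensionality
# the encoder outputs (512 for the Universal Sentence Encoder).
with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS songs (
            id        bigserial PRIMARY KEY,
            title     text NOT NULL,
            embedding vector(512)  -- must match the model's output size
        )
        """
    )
conn.commit()
```
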
00:19:48
Speaker
I mean, I mostly used Hugging Face. Okay. So Hugging Face has some great documentation on models: how they work, what they do, what their limitations are.
00:20:14
Speaker
That's the best type of documentation I've seen so far. I use a lot of ChatGPT. I actually really love ChatGPT for educational purposes. However, I also felt that it hallucinated a lot.
00:20:27
Speaker
So it was very confusing. That's why tinkering was eventually, in this particular scenario, the best way for me, because I was like, man, I don't know if I can trust ChatGPT, and this documentation is kind of crazy. I don't know what
00:20:43
Speaker
softmax means right now, or what whatever training method or model architecture really means. So I'm just going to start working with it and see what happens.
00:20:55
Speaker
And so it really was just setting up a really nice environment so I could iterate super fast, comfortably and painlessly, and just try things out. That was the best teacher for me.

Command Line Interfaces

00:21:09
Speaker
So, and I have to ask, because I know you do a lot of mobile work and you said, I really want to develop a command line interface. Is this just to get rid of all of the garbledy-gunk stuff in a phone and, you know, just get down to something more raw?
00:21:29
Speaker
So the main reason was not really related to mobile. It was just because I like command line interfaces. It just looks super cool. I mean, I just love the way that they,
00:21:40
Speaker
the colors and the text-based interfaces, I don't know. It was just visually appealing to me. So that was the main thing. But also, it is kind of refreshing to use UI libraries which are way lighter.
00:21:57
Speaker
I mean, obviously when you're writing stuff for Android, there's this huge framework. It has all the stuff and it's very opinionated about how you're supposed to do things, right? They have an entire UI library just for Android, and then they have the framework code and all that. And so with writing stuff for a command line UI,
00:22:21
Speaker
it was a lot more lightweight. I don't know, it was interesting. It was also very imperative. Do you remember Java Swing? Oh yeah. It was a lot like that. And so it's kind of
00:22:34
Speaker
painful in a way. Like, I get why modern user interface libraries are reactive or whatever. Like they're declarative.
00:22:45
Speaker
Declarative is the word. Declarative instead of imperative. So you just kind of write what you want on the screen and then it just does it for you, right? You don't have to hold a bunch of references and things like that.
00:22:56
Speaker
And so with this kind of thing, it was kind of like moving back to that previous state, because they don't have anything like that. It's still super imperative. And I don't know, it was cool. It was kind of refreshing to do something different, mostly.
00:23:14
Speaker
Yeah, I had done some stuff on the command line a while back, and I learned that there's actually a term for a user interface on the command line, which is a user interface. You're interfacing with a computer.
00:23:27
Speaker
They call them TUIs, textual user interfaces. I thought that was just funny, saying TUI. So anyway, TUI. Yeah, I had never known that before.
00:23:39
Speaker
Looking back, I mean, where is your solution today? What are you doing with it? Was that just kind of a little passion project on the side and it's done now, you're on to something else? What's going on? So at that point, when I realized that I needed to do some fine-tuning in order to have it perform better, I kind of left it at that and started learning fine-tuning.
00:24:07
Speaker
And my plan is, once I learn fine-tuning and learn some more of the underlying concepts behind how that thing is working, then I'll go back to it and make some improvements with that new knowledge, right? Like maybe choose a model that works better for that particular use case, based on knowledge of how the models work and what they're best used for depending on their architecture, things like that. And then maybe do some fine-tuning specifically for that case. But fine-tuning for that case,
00:24:39
Speaker
it seems very difficult, because it's not a trivial task. Teaching a model to understand abstract meaning in lyrics or poetry, I think, is
00:24:54
Speaker
not trivial, especially for a beginner. So, I think the trick is what they call labeling the data. You have an input, and then you want the model to have an output based on that. And so you have to be able to tell the model, this is what I would want you to output, so that it can learn: that supervised learning loop. That's probably the hardest part, coming up with all that labeled data, things like that. So there's that question, right? Okay, so what should my data look like? And sure, I guess I could give it examples of songs. I could give it the input that I want, like a text.
00:25:33
Speaker
But that's not what the thing is doing. I couldn't use as my examples input text, right, prompt text or whatever, with the output being the name of the song, because the model is not really doing that. It's just encoding meaning.
00:25:47
Speaker
Right. And so I would have to do something like, I don't know, summarization probably, like a summarization

Fine-Tuning AI Models

00:25:53
Speaker
task. Like, I'd have to give it poetry lyrics and then have it explain it, you know, summarize it into more concrete terms or something, right? Maybe doing something like that would kind of teach it, but it's not a super direct thing where it's like, oh, classify, right? For a classification task, it's very easy to imagine what kind of examples you would need, right?
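
For a sense of what such labeled data might look like, here is a purely hypothetical sketch, not the project's actual training set: pair poetic lines with plain-language paraphrases and train with a contrastive loss so their embeddings land close together. It uses the sentence-transformers library as a stand-in encoder, since the Universal Sentence Encoder doesn't offer a simple fine-tuning path:

```python
# Hypothetical sketch of contrastive fine-tuning on (lyric, paraphrase)
# pairs. The pairs below are illustrative; a real run would need
# thousands of labeled examples.
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("all-MiniLM-L6-v2")

pairs = [
    InputExample(texts=["Ground control to Major Tom",
                        "an astronaut alone and drifting in space"]),
    InputExample(texts=["Shareef don't like it, rock the Casbah",
                        "people defy a ban on rock music"]),
]

loader = DataLoader(pairs, batch_size=16, shuffle=True)
# Pulls paired texts together in embedding space, pushes apart the
# other (non-paired) texts in the same batch.
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
```
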
00:26:15
Speaker
So usually that's kind of what you start learning with. So it wasn't clear to me exactly what kind of examples I should use to train the thing. And it became evident quickly that it would be kind of difficult to get that data in the first place, which is something that causes a lot of resistance.
00:26:33
Speaker
It caused a lot of resistance for me to get into fine-tuning in the first place. I kind of stalled for a while, because I was like, man, I really want to keep on working on this project.
00:26:44
Speaker
And then I was like, man, okay, well, let me try to figure out this kind of example. I asked ChatGPT, like, oh, what kind of examples would you recommend for training this model to do this? And it was like, okay, well, you can do this and this and that and the other.
00:26:55
Speaker
But the examples were kind of complex, and I was like, right, well, how do I get 10,000 examples of this? And then I was like, oh man, well, I guess I could use Mechanical Turk, Amazon Mechanical Turk. If you were going to do it over again, if you were going to take another run at this project, now that you've done more and more research and you're starting fresh, what are you grabbing for? What technologies, what tools, what are you doing if you're going to take this on again?
00:27:23
Speaker
I'd have learned a little bit more about the models before choosing one of them, and I would definitely do more preprocessing.
00:27:34
Speaker
So I did some, but I think I would do a lot more. I think that might help. Let's see.
00:27:46
Speaker
What do you mean by that? Kind of tell me about the preprocessing. So there are some tools out there, some libraries for natural language processing, where
00:28:00
Speaker
it makes it really easy for you to extract things. To make it even easier without having to do fine-tuning, I would probably have it extract even more. So maybe even just take the entire lyrics and then just extract certain keywords, or nouns or adjectives or something, and do some preprocessing there just to help the model as much as I can, since I'm using a really generic one.
00:28:25
Speaker
Or I would use a more powerful model. Because I really just went with the first one. I looked at two: DistilBERT and Universal Sentence Encoder. Those are the two I looked at.
00:28:38
Speaker
So I would have probably taken a little bit more time to look at what other options there were, and then chosen one a little bit more appropriate for the task. I did not regret using a command line interface. Even though it took me a long time to get that running, it was worth it. It looked really cool. And, you know how when you open some command line tools, it'll pop up the logo in text art?
00:29:04
Speaker
I was able to do that, so that was cool. It was worth it. And yeah, overall it was a learning project. So I know that there are tools out there which make a lot of this a lot easier.
00:29:17
Speaker
Like, I think it's called, is it LangChain? LangChain. So the other day someone else from Caliberty was doing a demo on something they built with LangChain. And I was like, man, she did everything that I did in like two lines using this library.
00:29:34
Speaker
And so I was like, man, I might have used that if I was actually trying to just get something out super quick. But a huge part of the purpose of this project was actually for me to learn.
00:29:46
Speaker
So, you know, I think I would have just done the same. I was struggling with, you talked about, how do you break up the song? Do I do it by stanzas? Should I repeat the chorus multiple times? There's a lot to think about. So I chewed on this forever. Like, how do I chunk?

Automating Text Chunking

00:30:04
Speaker
That's the verb, right? How do I chunk this thing up into small chunks, and make it so that it's not too big, not too small, that kind of thing.
00:30:11
Speaker
And I'm like, wait a minute. I've got an LLM here that understands language. I just fed the whole darn thing to the LLM and said, break this into chunks. And it did it for me, based on its understanding of language and the nuance and whatnot.
00:30:27
Speaker
And it actually did a pretty darn good job. Now, I ran into trouble when I had really big documents and stuff, but I was able to work around that. But I thought that was really cool. Like, why am I messing around with this? This thing understands language really, really well. Let it break this thing apart.
00:30:42
Speaker
And that was kind of a breakthrough for us, well, for me, when we were working on this thing, so I didn't have to fiddle with it so much all the time. I just threw it at the language model and let it break it apart.
00:30:53
Speaker
And it actually did a pretty good job. And then what I also told it to do was not just break the text apart, but, kind of like you talked about, get rid of the oohs and ahs, and what we call stop words in natural language processing. Get rid of all that stuff.
00:31:07
Speaker
So it streamlined that. And I said, basically, optimize the chunks that you extract for inserting into a vector database. So it knew to do all that: get rid of all the fluff, and leave just the semantic stuff, the stuff that would actually encode and provide meaning. And it was actually really cool. It worked out really well.
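
A sketch of that LLM-driven chunking, assuming Python with the OpenAI client; the model name and prompt wording are illustrative, not the host's actual code:

```python
# Sketch: hand the whole document to a chat model and ask for chunks
# already optimized for a vector store (filler and stop words removed).
import json
from openai import OpenAI

client = OpenAI()

def chunk_with_llm(text: str) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": (
                "Break the following text into semantically coherent chunks "
                "optimized for insertion into a vector database. Remove stop "
                "words and filler; keep only meaning-bearing content. "
                "Return only a JSON array of strings.\n\n" + text
            ),
        }],
    )
    # A production version would validate the model's output more carefully
    return json.loads(response.choices[0].message.content)
```
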
00:31:26
Speaker
So I thought that was a lot of fun. And I also took it a little step further. It's probably a little expensive to do it this way. When we have a query come in, when the user types things, I send that to the language model. I say, this is the query they gave me, the prompt they gave me.
00:31:44
Speaker
I need to use this to search a vector store. Can you basically rewrite this, or beef this up for me, so to speak, so that it's best suited for searching a vector store? And it did a pretty good job of that, too.
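
The query-rewrite step, sketched under the same illustrative assumptions as the chunking snippet above:

```python
# Sketch: ask the model to rephrase the user's prompt into a form better
# suited for similarity search against a vector store.
def rewrite_query(user_query: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": (
                "Rewrite the following query so it is best suited for "
                "semantic search against a vector store: emphasize the key "
                "concepts and drop filler. Return only the rewritten query.\n\n"
                + user_query
            ),
        }],
    )
    return response.choices[0].message.content.strip()
```
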
00:31:57
Speaker
So yeah, it was pretty cool. Yeah, and I think this generally is one of the challenges with, well, probably one of the biggest challenges in that part of RAG, right? The part that I touched on and what you just mentioned: chunking. How do you chunk this stuff, this complex input or text, so it has what you need it to, so that we can encode or quantify what we want to quantify, and access it easily,
00:32:28
Speaker
query through it easily, and things like that. So definitely chunking is a challenge. And also, it's funny you mentioned that. It reminded me of something that I read a while back when I was getting into AI.
00:32:50
Speaker
I started reading this book and then the book just overwhelmed me a bunch. But this is one of the things that I remember from it. And it was like, AI or machine learning style programming,
00:33:06
Speaker
or engineering, whatever you want to call it, departs from traditional programming in that, for traditional programming, you define the rules and the input.
00:33:18
Speaker
You define the rules and then you provide input, and the machine produces output, right? But with this kind of stuff, you give it examples, like input and expected output, and then the machine will figure out the rules for you.
00:33:34
Speaker
And so I think there are a lot of challenges where, like, I've come across this a bunch, right, where I'm thinking about something or some kind of problem, or in your case, how am I going to chunk this or whatever?
00:33:46
Speaker
And sometimes there are a lot of rules that we don't fully understand, or it might just not be feasible for us to figure them out, right? Like to write down all the rules for processing imagery, or determining what a dog is. Like, oh, if this pixel is in this position, it's not a dog. If it's that, if it's, you know... And so there's this type of problem which I feel AI just solves the best. And it might just be the only way to solve a lot of these problems, because we can just
00:34:14
Speaker
feed it real-life information, a bunch of these examples and data, and it'll come up with the rules for us. It'll do it in a really obscure way. It's kind of like we don't even understand what rules it uses entirely, right? But it'll come up with those rules. Yeah. I think you bring up the identify-a-cat example.
00:34:33
Speaker
That's one of the big things that people teach with AI: image recognition, like trying to find a cat in a photo or something like that, or an apple and an orange, those kinds of use cases. But what's interesting, I think that's a stepping stone. People kind of lose sight of the fact that just getting it to recognize a cat doesn't really enrich human beings' lives, right? Like, if I see something that's a cat, I'm pretty good at that. I don't need a computer to tell me how to find a cat.
00:35:02
Speaker
However, if you take it to the extreme and turn it loose on bigger problems that we don't have the perception to be able to understand, then we get really interesting findings. I forget the name of the antibiotic, but they found an antibiotic that could kill MRSA or whatever, right, one of the resistant strains of bacteria.
00:35:27
Speaker
We would never have... they have no idea how it figured it out. The model saw connections that we never would have picked up on, you know? So that's where you want to have the computer do the computer thing, right? Like, I don't need you to show me what a cat is. I can do that.
00:35:43
Speaker
You go find all of the interconnections between chemicals and all that stuff. That's going to save people's lives. That's what the computer can be good at that we just can't do. We don't have the ability to do that. All right, Azriel, this has been fantastic. Up next, we're going to do the segment of the show that we call Ship It or Skip It.
00:36:02
Speaker
Ship or skip, ship or skip, everybody, you've got to tell us if you ship or skip.

Future of Retrieval-Augmented Generation (RAG)

00:36:08
Speaker
Do you think RAG really has a future in the generative AI world? Is it here to stay, or is it just a passing trend?
00:36:15
Speaker
Like a week after I started this, they're like, oh, now it has a million tokens in its context window. And I was like, man, that kind of makes RAG obsolete. I mean, not entirely obsolete, but you don't need RAG in a lot of situations now that we have that. Unless you want to save money, right? I think it comes down to a cost thing.
00:36:43
Speaker
But you used to actually need RAG to be able to do certain things, because the context window of the models was just limited.
00:36:54
Speaker
So you had to somehow fetch some of this information, prefetch it, to feed it as part of the context, so the model could read from it, instead of having to know about this huge information base which is way larger than the context window. But now, for most cases, you can just plug it into the query.
00:37:15
Speaker
Yeah, you can give it an entire code base or a huge document. We could probably fit our entire Caliberty wiki in a million context tokens, right, and then just give it the prompt right after that and have it do the task. So I don't think it's needed anymore to enable a model to do those kinds of tasks, like being augmented with information, but it makes it
00:37:44
Speaker
cheaper, for now. So yeah, I think I would not invest too much time in learning RAG at the moment beyond the fundamentals, because I feel like
00:38:05
Speaker
it's probably going to change moving forward. And even if it doesn't change, it's going to be super abstracted at some point. Like with LangChain, right? Like, oh, now it's two lines and you can just do the same thing. So I feel like... I think it's good to know, but I think the industry is changing too quickly for me to be like, oh, I'm going to become a RAG specialist.
00:38:30
Speaker
Okay, so you're skip it-ish.
00:38:36
Speaker
Skip it-ish. I think that's a technical term. Skip it-ish. I'm with you on that. I think the context window thing is making RAG somewhat obsolete. I still think it's probably going to have its place, like you said, as a cost-saving mechanism, right? Now, I'm old enough to remember back in the day when it was like,
00:38:59
Speaker
hey, I only have so many minutes on my cell phone, kind of thing, right? And that's how we are right now with these generative models, right? You're having to pay by the token for every single request. But now it's like, I don't pay for extra internet. I don't pay for extra time on my cell phone. So if
00:39:16
Speaker
this becomes more of a utility, so to speak, like those things have, where you're not doing that pay-as-you-go kind of model. Absolutely. You know, like you said, we could upload our whole wiki. I mean, I was thinking of our handbook, right? That's one of the big RAG use cases. Everybody says, oh, you can take your company handbook and break it apart. But let's face it, our handbook's not, I mean, it's not Tolstoy, right? We can probably manage to send it up there and be like, hey, here you go.
00:39:48
Speaker
Just read this whole thing and tell me, you know, how many weeks do I get for paternity leave? And it would spit it out, right? So yeah, I'm with you on that. I think it will get there, but I do think that we're going to be with RAG for a little while.
00:40:03
Speaker
All right. So you have vast experience building user interfaces for users. One of the trends I'm seeing or hearing about, and it's AI-related, is this: instead of
00:40:19
Speaker
giving the user kind of all these widgets and controls and everything for them to go do the things that they want to do,

AI-Driven Interfaces for User Interaction

00:40:26
Speaker
a lot of folks are exploring this idea where the interface is very simple, almost like Google, right? So you just type in, I want to pay my electrical bill.
00:40:37
Speaker
And you just type that in. And instead of having to navigate through a bunch of stuff and widgets, the AI just figures out, oh, I see you want to do this. Let me go do this thing. What do you think about that concept as a human-computer interaction kind of thing?
00:40:53
Speaker
I think it's really cool. I mean, I'm a fan. I use Notion a lot, and Notion is kind of cool in that way. I think it's integrated a lot of that.
00:41:06
Speaker
And actually, I recently bought a printer. It's like an Amazon printer. And it came with an app. And printer apps usually are, they're usually terrible. I hate printer apps.
00:41:23
Speaker
But it had the normal UI, right? It has the traditional UI. But it also had, when I was setting it up, it actually opened a chat interface. And it was really surprising to me that out of all applications, printer apps would be one of the first where I would see this stuff integrated, you know?
00:41:43
Speaker
So it was like a chat, like a chatbot. And it was like, oh, well, do you want to set up the printer? And I'm like, yeah. And then it'll give me the options.
00:41:55
Speaker
And it would all be this textual interface. And it's like, all right, so choose from this list which printer it is. I'd go, bam. Like, okay, I'm going to set this up for you now.
00:42:08
Speaker
Please press the button on the printer. And then I'll do it and I'll be like, yeah, I just did that. And it's like, okay, doing this now. It's cool. So the whole setup was text-based. It was like a chat. And I thought it actually worked really smoothly.
00:42:21
Speaker
I really liked it, and it worked well. So far, the use cases that I've seen for that, they work well. And I think that's one of AI's strengths, right? Just providing
00:42:39
Speaker
a textual interface that feels very natural. And I mean, I use ChatGPT a lot, so I have full conversations with it. And when I'm coding, I can tell it kind of what to do.
00:42:53
Speaker
And most of the time it's pretty accurate in interpreting what I want it to do. Even if I'm super vague, it'll be like, oh, I know you were trying to do this, right? So I feel like that's something that models overall can do really well: interpret your intent.
00:43:07
Speaker
I've never had issues with that. And so I think that, from a user interface perspective, most of the time that's what you're trying to figure out, right? What is the user trying to do? What do they want to do? Like, click this button to do this.
00:43:19
Speaker
Right. So I think that's a good use case for it. Obviously, it's not going to eliminate the other principles that we already have. I mean, you're always going to need some kind of more visual, button-y user interface, because, first of all, it's probably way cheaper to do that. I mean, it's probably, I don't know...
00:43:42
Speaker
How much more expensive would you think it is, triggering a user action via text through a model versus just pressing a button? At least, yeah, I'm sure it's pretty pricey. I don't know.
00:43:54
Speaker
At least, yeah. So, you know, there's just stuff that's always going to be simple, right? Log out, log in. So I think it's definitely always going to be some kind of hybrid, where you have your base interface, and then for certain functionalities,
00:44:11
Speaker
like search stuff. I think it's definitely going to impact search UI design, for sure. Like, 100%, search UI design and support, that kind of stuff is definitely going to get integrated. So yeah, definitely. I do like the idea too. I'm kind of a minimalist when it comes to interfaces and stuff. And I like that.
00:44:31
Speaker
I like the idea of the purity of it, so to speak, right? But, you know, my experience is, I don't know if you've interacted with some of these chatbots where you're like, I need to do X. And it's like, oh, I understand you need to do Z. Here are the options for how to do Z. Hold on a sec. Wait, I said X. Oh, Y is very convenient right now. Have you tried it on our website?
00:44:53
Speaker
Like, what are...
00:44:57
Speaker
And the last segment of our show that we do, it's really

Rapid-Fire Personal Q&A

00:45:01
Speaker
important. Lots of, you know, right-or-wrong-answer questions. So it's high stakes. You're going to want to be on your game for this. Okay. You ready to go? All right, let's do the lightning round.
00:45:36
Speaker
Godfather or Star Wars? Star Wars.
00:45:42
Speaker
What's your ideal outside temperature? 26 degrees Celsius. I have no idea what that means.
00:45:53
Speaker
26 degrees Celsius. All right. Okay. I've got to convert that to American. Would you eat a day-old taquito from 7-Eleven? No. Polka dots or stripes?
00:46:12
Speaker
Stripes. Have you ever been to Africa? No. Is double dipping at a party ever acceptable?
00:46:21
Speaker
No. And finally, what is the place you most want to travel? Argentina.
00:46:31
Speaker
Yeah, Patagonia. All right, Azriel, well, that wraps up this episode. Thanks again for coming on. I really enjoyed the chat and learned some cool stuff, and I'm going to have to go out and see if I can figure out what song is the best song for this moment in time. I don't know. I'm pretty good with movies. Songs? I don't know. Let's try that out. But yeah, again, man, thank you so much. This was great.
00:46:57
Speaker
Yeah, sure. Thanks for having me, and I'll definitely reach out once I have my app finished. Well, thanks everybody for joining us today. If you'd like to get in touch, drop us a line at theforwardslash@caliberty.com.
00:47:11
Speaker
See you next time. The Forward Slash Podcast is created by Caliberty. Our director is Dylan Quartz, producer Ryan Wilson, with editing by John Corey and Jeremy Brown. Marketing support comes from Taylor Blessing. I'm your host, James Carman, and thank you for listening.