
47 - Our New NLP Overlords

EXIT Podcast
2.4k Plays · 2 years ago

EXIT member and shaolin.ai founder Zach Martin and I discuss what the machines have planned for us in 2023.

  • What creatives and wordcels need to learn to make money with AI
  • How a large language model like ChatGPT differs from true AI
  • How NLP opens up new frontiers for machine learning & surveillance
  • Why ChatGPT probably isn't the Singularity
  • Launching a business with EXIT
Transcript

Zach Martin's Journey into NLP

00:00:17
Speaker
Hey everybody, welcome to the Exit Podcast.
00:00:18
Speaker
This is Dr. Bennett, joined here by Zach Martin.
00:00:21
Speaker
He's an Exit member and an expert in natural language processing, which is in the news these days as it's gonna consume and destroy all human endeavors this year.
00:00:35
Speaker
So I wanted to have an expert to come tell us how afraid to be, what kind of obeisance to prepare for our new NLP overlords.
00:00:44
Speaker
So welcome to the show, Zach.
00:00:46
Speaker
Yeah, thanks for having me.
00:00:46
Speaker
We'll see who has a job after all this shakes out.
00:00:52
Speaker
So tell us a little bit about your background and how you got into natural language processing.
00:01:01
Speaker
Yeah, yeah.
00:01:02
Speaker
So first of all, I kind of have a non-traditional background into this.
00:01:08
Speaker
So most guys in my world come from some type of statistical background, mathematics or computer science, essentially.
00:01:17
Speaker
So I actually didn't know what degree I was going to get in college and started just taking classes I was interested in.
00:01:26
Speaker
So I started taking a bunch of linguistics classes.
00:01:29
Speaker
And then I got married in college and I was like, oh, no, I actually have to make money now, right?
00:01:38
Speaker
So I was like halfway through a linguistics degree and I'm like, well, what am I going to do with that?
00:01:44
Speaker
I'm like, I guess I go to law school or something like that.
00:01:47
Speaker
That's what a lot of people in that situation do.
00:01:51
Speaker
But then I started talking to the professors and they were like, hey, NLP is the space to be.
00:01:57
Speaker
That's where you make money.
00:01:59
Speaker
And I'm like, OK, what's NLP?
00:02:01
Speaker
And so essentially what I learned is it's where
00:02:04
Speaker
kind of computers and human language intersect.
00:02:07
Speaker
So it's natural language processing as opposed to computer language, which would be like your Java, your C, your Python, those types

NLP Tools and Projects

00:02:16
Speaker
of things are all computer languages.
00:02:18
Speaker
Natural languages being French, English, Spanish, et cetera.
00:02:22
Speaker
Right.
00:02:23
Speaker
And so being able to process natural human languages via computers was kind of the, the,
00:02:31
Speaker
the idea there.
00:02:32
Speaker
So I kind of picked up coding somewhat on my own.
00:02:36
Speaker
I took a couple classes in college, but kind of bootstrapped my own coding background and paired it with my linguistic knowledge.
00:02:45
Speaker
I ended up being the only one from my graduating cohort of linguistics bachelor's degrees to have a job lined up in the industry
00:02:56
Speaker
before graduation.
00:02:58
Speaker
So it seems like there are a lot of jobs like that, where it's actually really valuable to be, like, a graphic design professional or a creative.
00:03:12
Speaker
If you can also talk to computers, like I know a lot of front end guys who didn't start as software developers.
00:03:18
Speaker
They started as essentially artists, creatives, and, and were able to,
00:03:25
Speaker
Basically, they had to go learn to code so that they could do the kind of work that exists in those fields, which is all done with computers.
00:03:32
Speaker
So it's an interesting... But I think we're sort of entering this realm, or maybe we are, maybe we're not.
00:03:39
Speaker
I want to ask you about this.
00:03:41
Speaker
Are we entering a realm with these chatbots and natural language processing where you actually don't need to be able to code to do some of this type of work?
00:03:54
Speaker
Well, what I'll say is 100%, you don't need to be, like, you don't need to be the guy that, you know, can write binary code and, like, make computers work.
00:04:09
Speaker
Like, you don't need to be a hardcore computer science guy to actually, like,
00:04:14
Speaker
be successful in kind of the coding world anymore. Part of it is not even due to these AI things, but just the fact that there's so many libraries and open-source sharing of code that you don't really need to know how to make a neural net to be able to run one on some data, if that makes sense. And now you don't even have to know how to code it necessarily, although there is still some
00:04:41
Speaker
gap between the actual like implementation of some of these ideas, like actually getting them to function on their own versus kind of, Hey, look, I did this one thing once and it kind of worked.
00:04:53
Speaker
So, yeah.
00:04:54
Speaker
Yeah.
00:04:55
Speaker
So I was a data scientist, you know, so-called, and, um, basically, my capstone project for my, um,
00:05:07
Speaker
bootcamp that I was part of, I did an NLP analysis of Game of Thrones.
00:05:12
Speaker
I took all of the script from Game of Thrones, everything, every word uttered on the show, and I ran it through essentially a black box in Python.
00:05:24
Speaker
I had no idea what that black box was actually doing.
00:05:28
Speaker
but it outputs some information about... I was basically trying to figure out, could I predict, based on the words people used, whether or not they were a commoner or a noble?
00:05:37
Speaker
So I had to go through and tag the data.
00:05:39
Speaker
This guy's noble, this guy's common.
00:05:42
Speaker
And I am definitely not a computer science guy.
00:05:47
Speaker
I'm not going to learn binary.
00:05:49
Speaker
I'm not going to learn even really all that much Python.
00:05:54
Speaker
But because these libraries existed, I was able to... Basically, you have to know how to...
00:05:58
Speaker
put in the input and interpret the output and you can get a lot done.
00:06:03
Speaker
Yeah, yeah, really it's just a matter of kind of knowing how to set up the data to go into the black box and then how to interpret it once it comes out.
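The Game of Thrones capstone described here can be sketched as a small text-classification pipeline: tag each line of dialogue with a label, vectorize the words, and let a library model act as the "black box." The dialogue lines, labels, and choice of model below are invented for illustration, not the actual project.

```python
# A minimal sketch of the "black box" workflow: set up the input,
# fit a library pipeline, interpret the output. Data is made up.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

lines = [
    "Winter is coming, my lord, and the realm must be ready.",
    "Oi, hand us that bucket before the rain starts.",
    "The small council will hear of this treachery.",
    "Ain't no coin left after the tax man came through.",
]
labels = ["noble", "common", "noble", "common"]  # hand-tagged, as described

# The pipeline is the black box: word counts in, a fitted classifier out.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(lines, labels)

print(model.predict(["The crown demands loyalty from every house."]))
```

With a real script's worth of tagged lines, the same three steps — tag, vectorize, fit — are the whole job; no knowledge of the model internals is required.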
00:06:13
Speaker
Yeah, so.
00:06:15
Speaker
So, yeah, I mean, that's that's what a lot of guys are doing in the data science space.
00:06:20
Speaker
Like there's definitely a need for those those guys with that higher level knowledge.
00:06:24
Speaker
And they're definitely making, you know, the huge, huge bucks and and doing research and that type of stuff.
00:06:30
Speaker
But if you're not a Ph.D. researcher or something like that, like there's a lot of stuff you can do.

NLP in Business Applications

00:06:36
Speaker
mainly just because a lot of companies are kind of stuck in the Stone Age still.
00:06:40
Speaker
So a lot of this AI stuff is completely new.
00:06:45
Speaker
But yeah, speaking of capstone projects, my final project for my final NLP class: I used to read news articles, and I would try to skim them really quickly to find out whether or not the news site was left-leaning or right-leaning in its bias.
00:07:05
Speaker
Because a lot of times you'll be like, ah, this new site sucks.
00:07:10
Speaker
Or it's no good.
00:07:12
Speaker
And so I would kind of skim it.
00:07:13
Speaker
And so I built an NLP algorithm that looked for certain keywords and weighted them based off of whether or not they appeared.
00:07:22
Speaker
And
00:07:25
Speaker
kind of mapped it out. I scraped a bunch of news sites and then, uh, mapped whether they were left-biased or right-biased.
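The keyword-weighting algorithm described here can be sketched in a few lines: count hand-picked cue words in an article and sum signed weights into a bias score. The keyword lists and weights below are invented for illustration.

```python
# Toy keyword-weighting bias scorer: negative leans left, positive leans
# right, zero means no signal. Word lists and weights are hypothetical.
LEFT_WEIGHTS = {"progressive": -2, "undocumented": -1, "equity": -1}
RIGHT_WEIGHTS = {"patriot": 2, "illegals": 1, "traditional": 1}

def bias_score(text: str) -> int:
    """Sum the signed weights of every cue word that appears."""
    score = 0
    for word in text.lower().split():
        score += LEFT_WEIGHTS.get(word, 0)
        score += RIGHT_WEIGHTS.get(word, 0)
    return score

print(bias_score("a progressive push for equity in schools"))    # negative
print(bias_score("patriot rally celebrates traditional values"))  # positive
```

Scraping many sites and averaging the scores per outlet gives the kind of left/right map described.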
00:07:32
Speaker
And so you proved like, uh, any fake news stuff and, and stuff like that.
00:07:39
Speaker
So you, you proved conclusively to your professors that, uh, that the, the left wing media was out to get us and, uh, so much for the tolerant left, et cetera.
00:07:48
Speaker
Yeah, exactly.
00:07:49
Speaker
Exactly.
00:07:53
Speaker
Um, but yeah, um,
00:07:56
Speaker
A little more on the timeline of just my background that's somewhat relevant is... So, like, the first job I got was for a company that did surveys.
00:08:07
Speaker
So, like, if you ever get a receipt from McDonald's that says take a survey and win a free McDouble or whatever it is, right?
00:08:13
Speaker
Or some, you know, hotels do it and airlines do it.
00:08:17
Speaker
So, this company...
00:08:20
Speaker
did surveys and I was building the NLP models that would flag what customers are talking about in the plain text.
00:08:30
Speaker
So like, are they talking about the hamburger?
00:08:32
Speaker
Are they talking about whether or not it tasted good or bad?
00:08:35
Speaker
Right.
00:08:36
Speaker
And then so, so pretty rudimentary stuff, but I was using IBM Watson, which was like an early like AI tool.
00:08:47
Speaker
And, um,
00:08:48
Speaker
But yeah, subsequently, I've done a lot more in the open source and Python stuff.
00:08:52
Speaker
But that's kind of where I got my start.
00:08:56
Speaker
And a lot of the stuff that I was doing back in the day is kind of being revolutionized by some of the newer developments in NLP, specifically transformer models, or large language models, like ChatGPT or GPT-4, which people have been hearing about.
00:09:16
Speaker
Yeah, there's a lot of cool stuff that's been done.
00:09:18
Speaker
I mean, it's, and it's been going on for a really long time.
00:09:21
Speaker
My two favorite, cause I'm like a, I definitely belonged in my MBA program.
00:09:26
Speaker
Like I'm, I'm like a business major kind of a guy at least like that's, if I have to, if you have to put me in a marketable place, that's where I'm marketable.
00:09:36
Speaker
I'm definitely not a technical guy, but what attracted me to data science was ultimately two stories of,
00:09:43
Speaker
One of them was Walmart.
00:09:47
Speaker
They were selling this bag of dog food and it wasn't moving.
00:09:54
Speaker
And they were doing all these analyses on all their different inventory, trying to figure out why.
00:09:58
Speaker
And it was like better quality than the name brand.
00:10:02
Speaker
They were trying to sell their like Equate brand dog food, right?
00:10:06
Speaker
And it was better quality.
00:10:07
Speaker
It was packaged to the right size and it was...
00:10:13
Speaker
And it was selling only on the weekends.
00:10:18
Speaker
And it was like radically underperforming during the week.
00:10:22
Speaker
And they were able through some data models that they built of customer feedback to figure out that customers were only purchasing it when the man was at Walmart.
00:10:37
Speaker
So it had to be either a man or a couple.
00:10:40
Speaker
And they were like, oh, it's because she can't lift the bag.
00:10:44
Speaker
And so that's why it wasn't selling.
00:10:47
Speaker
So they downsized the bag and then it moved.
00:10:49
Speaker
And they made a ton of money on this dog food.
00:10:52
Speaker
And then another story is Harrah's, the casino.
00:11:01
Speaker
They hired this CEO who was like a, I think he was like a stats professor.
00:11:05
Speaker
He was some academic in stats and data science.
00:11:10
Speaker
And he basically just started hoovering up all of their user data that they could possibly collect.
00:11:17
Speaker
And they were viewing these reward programs that they were using as like a way to incentivize
00:11:27
Speaker
Like it was like a discount program.
00:11:29
Speaker
Like we're going to make money because we're going to offer these discounts and we're going to retain customers, et cetera, et cetera.
00:11:33
Speaker
And he was like, no, the purpose of your rewards program is to collect user data, which that's like not even, that's like a cliche now.
00:11:41
Speaker
Everybody knows that.
00:11:42
Speaker
But at the time it was kind of revolutionary.
00:11:44
Speaker
And so Harrah's comes out with this ad campaign where they've got the luggage carts at the hotel and they're wheeling out, like, a giraffe.
00:11:54
Speaker
And the whole point of this ad campaign was like, do you want a giraffe?
00:11:59
Speaker
Do you want hookers and blow?
00:12:00
Speaker
Like you tell us what you want and we'll just make it happen because we've got this personalized, individualized model of who you are and what you like.
00:12:08
Speaker
And we're going to learn everything that you love and we're going to give it exactly to you.
00:12:12
Speaker
And, uh, and that actually took them from being, like, this also-ran on the skids outside the Strip to, like, one of the most successful, uh, chains in the city.

Ethical Concerns in Data Science

00:12:22
Speaker
So, uh, there's cool stuff that happens with, uh, with data science in general and NLP opens up to a lot of this like fuzzy, squishy sentiment type data.
00:12:36
Speaker
Yeah.
00:12:36
Speaker
Yeah.
00:12:37
Speaker
And, and that, that kind of takes me back to my experience in, uh,
00:12:42
Speaker
customer experience is what it's called that field of like surveys and stuff.
00:12:46
Speaker
So like previously, like, and, and still some companies do it where they try to give you like a hundred different questions that you rate from one to 10 and like everybody hates those.
00:12:55
Speaker
And, and also they're not very useful.
00:12:59
Speaker
So like the philosophy of one of the companies I was at was just like, how about they just tell us what they want to say.
00:13:05
Speaker
Right.
00:13:06
Speaker
And so, um, so like one of them was like Chick-fil-A.
00:13:11
Speaker
They, uh,
00:13:12
Speaker
We did some study for them and found out that one of their biggest complaints was the blue cheese crumbles in one of their salads didn't come in a separate bag.
00:13:23
Speaker
So they had like almonds in a separate bag and apple slices in a separate bag or whatever for their salad.
00:13:30
Speaker
But the blue cheese crumbles they would just throw on top and then a bunch of people didn't like the blue cheese but like couldn't remove it.
00:13:37
Speaker
And so we told them that.
00:13:38
Speaker
And how are you going to collect that from a one through five?
00:13:43
Speaker
Yeah, one through five.
00:13:43
Speaker
How was your service?
00:13:44
Speaker
Are you going to recommend them again?
00:13:46
Speaker
Yeah.
00:13:47
Speaker
So sometimes it's all about just tell us what you want.
00:13:52
Speaker
Tell us what you're thinking.
00:13:57
Speaker
But yeah, a lot of fun times in NLP, including at one point for Little Caesars.
00:14:05
Speaker
we had to come up with all the different types of things that could be on top of a pizza that shouldn't be there.
00:14:13
Speaker
So we were like, all right, Band-Aids, you know, I don't know, needles, you know, what else?
00:14:19
Speaker
And so like that was my task because it was when I was kind of new at the company and they were like, just write down all the most horrible things you can think of that could happen at a Little Caesars.
00:14:29
Speaker
So I was just typing every swear word I knew, every disgusting thing I could think of.
00:14:35
Speaker
And yeah, that's what a job.
00:14:37
Speaker
Yeah.
00:14:38
Speaker
It was a fun job, but yeah.
00:14:44
Speaker
And like on the, on the, on the micro level, when you're like looking at it right up close, it's like,
00:14:51
Speaker
Oh, it's great that Chick-fil-A is going to take the blue cheese out of my, and it's great that my wife can lift the dog food bag.
00:14:57
Speaker
And it's great that, you know, maybe Harrah's knows that I'm LDS, so they're not going to be giving me discounts on alcohol.
00:15:03
Speaker
They're going to give me discounts on meals or whatever it is.
00:15:06
Speaker
Like, close up, you can see this, like, very uncontroversial, uncomplicated, good reason to do this.
00:15:16
Speaker
And then you pan back and it's like Lovecraftian existential horror.
00:15:21
Speaker
Well, you know, they know everything about you, these companies, like, through your cookies and your ad persona.
00:15:30
Speaker
They can tell pretty well whether or not you're pregnant.
00:15:34
Speaker
Like, yeah, off of your history, but, like, some people might want that private.
00:15:41
Speaker
Right.
00:15:41
Speaker
So, or, like, they can tell if you're going to get divorced.
00:15:45
Speaker
Stuff like that.
00:15:47
Speaker
Like some companies, I heard it was like some bank was like, hey, are you thinking of getting divorced?
00:15:54
Speaker
Here's private accounts.
00:15:57
Speaker
And they think we're helping this customer by knowing their persona.
00:16:00
Speaker
But like, yeah, there's definitely like a dark side to it.
00:16:04
Speaker
And I mean, data privacy is something people care about a little bit

Impact of AI on Industries and Jobs

00:16:09
Speaker
now.
00:16:09
Speaker
But like, yeah, for a while there, nobody was even thinking about it.
00:16:13
Speaker
Well, there's even like info hazards.
00:16:15
Speaker
Like I don't necessarily want to know that I have, you know, three years to live, or, like, you know, maybe there's a way to tell me that I'm about to get divorced that helps me.
00:16:30
Speaker
But that way probably isn't like, hey, let's facilitate the transaction of your divorce as quickly and directly as possible.
00:16:39
Speaker
Well, and what's funny is talking to some of these like marketing people, they, they can't even comprehend that like a customer might not want you to know certain things like, like the, the creep factor, they don't even think of it.
00:16:53
Speaker
Yeah.
00:16:54
Speaker
Yeah.
00:16:55
Speaker
And I mean, you know, obviously on Google there's, there's like the question of porn, but like, there's just, there's so many other dimensions to that, like privacy question and like,
00:17:06
Speaker
Yeah.
00:17:07
Speaker
Well, and like, you know, a few years ago, well, like people are always like, oh, I talked about this thing and it showed up in my Facebook feed or whatever.
00:17:15
Speaker
And it's like, yeah, they said they were doing that.
00:17:17
Speaker
They were turning on the microphone.
00:17:19
Speaker
If you download the Facebook app on your phone, they have permission to your microphone and they are listening and using it for ads.
00:17:27
Speaker
So like, yeah, it's not fake.
00:17:29
Speaker
They said they were doing it, you know.
00:17:31
Speaker
And like you, you have to like, and it's an interesting situation because on the one hand, like ideologically, my, my opinion is basically that there's not a whole lot of conspiracy.
00:17:42
Speaker
There's a lot of like essentially automated human behavior going on.
00:17:47
Speaker
Like just, it's just people responding to pings.
00:17:53
Speaker
The, the,
00:17:55
Speaker
The analyst gets an email and that's why they have to build the thing.
00:17:59
Speaker
And the person who sent the email, they heard from the VP and the VP heard it from the CEO and CEO heard it from the shareholders and the shareholders are this like big blob of cognition that doesn't have like a human will attached to it.
00:18:13
Speaker
And so like you're sort of on the one hand, you're counting on these systems to be kind of dumb and maybe they are like Facebook.
00:18:24
Speaker
Does Facebook have like a nefarious big picture like doomsday scenario that they're trying to instantiate?
00:18:32
Speaker
Sell you those Chinese leggings or something?
00:18:34
Speaker
Yeah, or are they just trying to sell you leggings?
00:18:35
Speaker
Yeah, like exactly.
00:18:36
Speaker
And so far...
00:18:40
Speaker
It seems like the answer is basically that there is no conspiracy and it's, but it's at the same time in the service of selling you leggings or selling you, uh, a hotel room at a casino or selling you salads at Chick-fil-A.
00:18:57
Speaker
They're building this edifice that is, is just so incredibly dangerous.
00:19:04
Speaker
Yeah.
00:19:05
Speaker
Yeah.
00:19:06
Speaker
And, and like, uh,
00:19:08
Speaker
I mean, there's definitely some nefarious parts of it.
00:19:11
Speaker
Like they tested just showing people the like today is the day to vote.
00:19:17
Speaker
Like just putting that on the top of Facebook, like changed voting patterns by like a substantial amount, like election change.
00:19:25
Speaker
Yeah.
00:19:26
Speaker
Just being like, and which, which voters do you send that to?
00:19:29
Speaker
You know?
00:19:30
Speaker
Right.
00:19:30
Speaker
And, and we found out like they obviously had a bias after like 2016 and stuff.
00:19:36
Speaker
Yeah.
00:19:37
Speaker
Yeah, man.
00:19:39
Speaker
It's crazy.
00:19:40
Speaker
Well, so what do you think about kind of the sort of like... Are there like buggy whip industries right now that are just going to be eaten by this technology?
00:19:55
Speaker
So, like, talking about, like, ChatGPT and stuff like that?
00:19:59
Speaker
Or just AI in general?
00:20:02
Speaker
Well, maybe let's start with AI in general and then specifically ChatGPT.
00:20:06
Speaker
Yeah, so I mean...
00:20:08
Speaker
Honestly, I think it's stuff that was already being outsourced and automated that's going to continue disappearing.
00:20:16
Speaker
So they're replacing Indians and Filipinos, you mean?
00:20:19
Speaker
Yeah.
00:20:20
Speaker
Yeah.
00:20:22
Speaker
So basically that's it.
00:20:24
Speaker
Like call centers are going to be, you know, it's at a certain point.
00:20:29
Speaker
that AI bot that's like, tell me how I can help you, is going to be much more helpful than going through someone who barely speaks English.
00:20:39
Speaker
It's already close.
00:20:40
Speaker
Yeah.
00:20:41
Speaker
It's already close.
00:20:42
Speaker
It's going to get even better, right?
00:20:44
Speaker
And cheaper for companies to do that.
00:20:47
Speaker
So, I mean, because essentially all the, like any tier one tech support, right, is just like, we're just here to press the button that routes you to the right place.
00:20:58
Speaker
So like anything that's like that is is essentially going away.
00:21:06
Speaker
There's definitely like jobs.
00:21:08
Speaker
And I mean, those who have worked in in kind of the tech side of things, there's certain jobs that are like.
00:21:18
Speaker
Like nobody wants to do and those types of things are going to get automated away.
00:21:24
Speaker
So like I honestly think like.
00:21:28
Speaker
more of your like data wrangling where like your job is just to like say, Oh, we need to get this table and that table and put them together.
00:21:37
Speaker
Like that is not probably the safest place to be in, in the coding space.
00:21:42
Speaker
Right.
00:21:43
Speaker
And nobody loves to be there anyway.
00:21:45
Speaker
Although, although, um, having worked in like regulated industries and, and stuff like that, there, there's going to be regulations to come out around AI stuff.
00:21:56
Speaker
And also like,
00:21:58
Speaker
You know, I've worked at like banks and stuff and you can't just be like, oh, well, we missed a record on accident.
00:22:04
Speaker
Right.
00:22:05
Speaker
So like to a certain extent, there's always going to be kind of that manual backup check, even if it's automated with AI.
00:22:14
Speaker
So to a certain extent, there may be, like, a resurgence of, like, you're just the guy who checks the AI and makes sure it does a good job, you know?
00:22:23
Speaker
Yeah.
00:22:24
Speaker
And I mean, like in in in defense, if you have a security clearance, I mean, they're never throwing those algorithms over the wall like that's just never.
00:22:33
Speaker
I mean, they won't even half those half those shops won't even use Tableau like it's got to be like Excel and it's got to be like Excel 1997, not Excel 2003.
00:22:44
Speaker
No, 100 percent.
00:22:46
Speaker
Like like definitely government is behind like pharmaceutical industry is still using stuff from the 70s.
00:22:53
Speaker
Like, and that's, that's the thing is like, if you are interested in getting into coding, I don't think you have to worry about it.
00:23:00
Speaker
Like within the next five years by any means, and probably not within the next like 25 years, because there's, there's so much out there and there's so many different businesses and, and really the only companies that are really making use of these AI algorithms.
00:23:16
Speaker
And, and a lot of them are not even doing a good job of it.
00:23:20
Speaker
Um,
00:23:21
Speaker
are like your Silicon Valley tech companies.
00:23:23
Speaker
So like, yeah, you know, they're way past kind of the baseline of coding and stuff like that.
00:23:31
Speaker
But there's so many big companies, old companies, and even startups that just like need, you know, we just need a model that predicts fraud.
00:23:41
Speaker
That's it.
00:23:41
Speaker
Like, so if you can do that, like you're way ahead of a lot of other people.
00:23:50
Speaker
And so I take it you're not like a Kurzweil guy.
00:23:52
Speaker
You're not like this is going to eat everything.
00:23:55
Speaker
It's going to start accelerating its own development.
00:23:59
Speaker
I don't.
00:24:00
Speaker
I'm like, I don't care about like, you know, are we just part of an AI?
00:24:04
Speaker
And, you know, I don't really think about that type of stuff that often.
00:24:10
Speaker
So like, you know, maybe.
00:24:13
Speaker
But yeah.
00:24:14
Speaker
just kind of being behind the scenes of this stuff and, like, knowing how, like, ChatGPT works and stuff.
00:24:20
Speaker
It's so not scary.
00:24:22
Speaker
It's very like lame in how it works.
00:24:26
Speaker
And so we've got a lot, like a long way to go, like to get any of this to the point where it's like, you know,
00:24:34
Speaker
going to really disrupt the entire economy.
00:24:37
Speaker
Now, that being said, it's like, cause they'll talk about, like, oh, it's indistinguishable from, like, a New York Times journalist.
00:24:46
Speaker
And I'm like, yeah, that's because those people aren't human.
00:24:49
Speaker
They're robots.
00:24:50
Speaker
Right.
00:24:51
Speaker
Right.
00:24:51
Speaker
So there's nothing about actual like human creativity.
00:24:53
Speaker
Well, well, yeah.
00:24:54
Speaker
And it's mimicking.
00:24:55
Speaker
Right.
00:24:56
Speaker
So, so it's been trained on a training set that includes all of the New York Times.
00:25:01
Speaker
So it can sound like a New York Times person pretty well.
00:25:04
Speaker
But yeah, you're not getting any original thought out of it.

Evolution of NLP and AI Models

00:25:08
Speaker
But what it will replace is BuzzFeed listicles.
00:25:13
Speaker
You don't need to pay someone 45 grand a year to write those stupid headlines or whatever.
00:25:23
Speaker
So stuff like that, yeah, maybe you need to worry about your job.
00:25:28
Speaker
But I think for a lot of things, we're way far away from that.
00:25:34
Speaker
There was some guy who, did you see this guy who started a business with ChatGPT?
00:25:39
Speaker
I've seen a lot of those guys saying that.
00:25:41
Speaker
Oh, okay.
00:25:42
Speaker
There's a thread.
00:25:42
Speaker
Which one?
00:25:43
Speaker
Well, and I mean, to some extent, he's relying on the virality of the story.
00:25:48
Speaker
So it's like, you know, it's not really legit.
00:25:51
Speaker
But like, basically, he asked ChatGPT, like, hey, what's the best way to turn this $100 into an effective business?
00:25:59
Speaker
And ChatGPT says, you should start a business in green gadgets, green like household gadgets.
00:26:09
Speaker
That's a really good market.
00:26:10
Speaker
You should pick that.
00:26:11
Speaker
So he asks it like, what should my domain name be?
00:26:15
Speaker
And it's like, well, here's a couple of suggestions.
00:26:16
Speaker
He picks one.
00:26:17
Speaker
He goes and buys like green gadget guru.
00:26:20
Speaker
And he puts it on there and he starts just posting like affiliate links to Google.
00:26:25
Speaker
to like Amazon for thematic gadgets.
00:26:31
Speaker
And part of the reason that he turned it into like $1,500 in like a week.
00:26:37
Speaker
So a hundred bucks to 1500 bucks, that's pretty good.
00:26:41
Speaker
But to some extent, like I'm sure that he was driving traffic by saying, look at this neat application of AI.
00:26:49
Speaker
And that's, you know, and then Google sees that it's getting traffic and it sends it more traffic.
00:26:53
Speaker
And so there's, it's, it's a little bit fake, but I, it makes me think about like, man, if we're, if we're no longer limited by the number of like content monkeys that
00:27:06
Speaker
And we can just generate, like, just endless trash. Like, the amount of just, just horrendous, useless internet there's going to be.
00:27:15
Speaker
Can't even imagine.
00:27:17
Speaker
Yeah.
00:27:18
Speaker
It's definitely going to increase the amount.
00:27:20
Speaker
Like whenever, like I was looking to buy like a vacuum cleaner and you're like this vacuum cleaner review, you know, try to check a YouTube review.
00:27:28
Speaker
They're all AI generated now.
00:27:31
Speaker
Like all the top 10 results.
00:27:33
Speaker
I'm like, I just want to see someone using the vacuum cleaner.
00:27:37
Speaker
And it's all just like, we just grabbed the top five pictures from Google and then had a...
00:27:44
Speaker
text to voice, read it out for us.
00:27:48
Speaker
Yeah, which then, I mean, that muddies the water for algorithms that are based on that kind of feedback data, right?
00:27:54
Speaker
Like there's going to be, this thing is going to backwash on itself in a lot of ways, especially because it's not one, it's not one AI doing it.
00:28:02
Speaker
It's like a bunch of AIs that, you know, maybe theoretically generate content for each other.
00:28:09
Speaker
Right, right, right.
00:28:10
Speaker
Because if it was one, then you could just be like, hey, did you write that review?
00:28:12
Speaker
Okay, exclude that from your training set.
00:28:14
Speaker
But you can't do that.
00:28:19
Speaker
Yeah, yeah.
00:28:22
Speaker
It's kind of bleak when you think of that type of stuff.
00:28:25
Speaker
Although, I mean, to a certain extent, you know, I don't know.
00:28:30
Speaker
The problem is, like, what people don't really get about ChatGPT is the recommendations or the answers it gives.
00:28:37
Speaker
are not necessarily supposed to be accurate.
00:28:41
Speaker
They're just supposed to sound accurate.
00:28:44
Speaker
Right?
00:28:45
Speaker
So if you say, give me the best domain name for this idea, it would give you a domain name.
00:28:53
Speaker
It doesn't know if it's the best.
00:28:54
Speaker
It has no clue if it's the best.
00:28:56
Speaker
It's not, like, running a model on which of these is the best.
00:29:00
Speaker
It's not doing any kind of modeling or data science of its own.
00:29:04
Speaker
No, no, no.
00:29:05
Speaker
So yeah, so like, and the way, like, because they framed it as a chatbot, which maybe I can get into like the history of that, because they framed it as a chatbot, people interact with it as if it's like this intelligent bot.
00:29:19
Speaker
But what it is, all it is, is it takes your prompt, so what you type, and it tries to match an output to that, to what you typed as best as possible.
00:29:28
Speaker
So yeah,
00:29:29
Speaker
Whatever you write, it's going to try to match that as best as it can from what it's been trained on.
00:29:35
Speaker
But it has no clue whether or not any of it's accurate.
00:29:38
Speaker
Now, I've seen they've done some fine-tuning on the newest version, so GPT-4.
00:29:44
Speaker
That's supposed to be a little bit more accurate, but it still kind of has the same problems.
00:29:51
Speaker
Oh, yeah, so do go into the history of it.
00:29:53
Speaker
Tell me about how they...
00:29:56
Speaker
So part of this is, NLP kind of goes back to like the earliest days of computing, like back in the 60s.
00:30:03
Speaker
People were like, let's train a bot to talk, you know, and IBM's like, we got it speaking. And it's always been a thing, you know; you've had HAL 9000 in sci-fi, and they've always wanted some kind of chatbot.
00:30:21
Speaker
That's that's been a thing.
00:30:24
Speaker
I built a chat bot before this technology was out.
00:30:27
Speaker
And essentially all you're doing is looking for keywords and strings and then trying to match that with an output that would be useful.
00:30:38
Speaker
And so sometimes you have certain like targets that you're trying to get people towards and you're, you're kind of building like a video game, you know, like a choose your adventure video game.
00:30:47
Speaker
That's just a little bit more advanced.
00:30:49
Speaker
Right.
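The keyword matching Zach describes can be sketched in a few lines; the rule set here is made up purely for illustration:

```python
# Toy rule-based chatbot, the pre-ML style: scan the input for known
# keywords and return a canned response.

RULES = {
    "price": "Our plans start at $10/month.",
    "hours": "We're open 9am-5pm, Monday through Friday.",
    "human": "Let me connect you with a support agent.",
}

FALLBACK = "Sorry, I didn't catch that. Can you rephrase?"

def reply(user_text: str) -> str:
    text = user_text.lower()
    for keyword, response in RULES.items():
        if keyword in text:        # simple substring match on keywords
            return response
    return FALLBACK                # no keyword matched
```

Chain enough of these rules toward a few target outcomes and you have the "choose your own adventure" structure he mentions.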
00:30:50
Speaker
But yeah,
00:30:52
Speaker
with the onset of machine learning, and the advancements in machine learning that took place probably within the last 15 years, they've started, you know, training this stuff using it.
00:31:04
Speaker
So like your Google Assistant, your Siri and your Alexa, they started coming out using these more advanced machine learning models.
00:31:12
Speaker
So one of the issues, though, is
00:31:22
Speaker
If you're not just matching words to certain responses, so you're like, if they say this word, we'll give them this response.
00:31:31
Speaker
It's kind of hard to work with text data because it's not numerical.
00:31:35
Speaker
So machine learning in general requires numbers.
00:31:39
Speaker
So you have to give it some numbers.
00:31:42
Speaker
So for instance, a classic machine learning problem is they took measurements of three different flower species.
00:31:50
Speaker
Right.
00:31:50
Speaker
So they measured the stem length, the petal width, the sepal length and stuff like that.
00:31:57
Speaker
And there's three different species they know in real life.
00:32:00
Speaker
And you can use a machine learning algorithm to look at all those different measurements of the various like hundreds of different flowers of each species.
00:32:08
Speaker
And then you can try to place them in which species is there without actually knowing the answer ahead of time.
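The flower example is the classic iris dataset; assuming scikit-learn is installed, the "group the measurements without being told the species" step might look like:

```python
# Cluster the iris measurements (sepal/petal lengths and widths) into
# 3 groups without ever showing the algorithm the species labels.
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

iris = load_iris()
X = iris.data                       # 150 flowers x 4 measurements

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(X)    # one cluster id per flower

print(len(clusters), sorted(set(map(int, clusters))))  # 150 flowers in clusters 0-2
```

The clusters line up fairly well, though not perfectly, with the three real species.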
00:32:14
Speaker
Yeah, I've actually used some of those apps.
00:32:16
Speaker
They're pretty good.
00:32:18
Speaker
They're pretty effective at correctly IDing.
00:32:22
Speaker
We made some jam in our yard out of some berries that we collected that we were...
00:32:28
Speaker
unsure what they were, but I was able to use like a couple separate apps to identify the berries.
00:32:35
Speaker
And they're like, oh yeah, that's, that's autumn olives.
00:32:36
Speaker
You can eat those.
00:32:38
Speaker
And I said, thank you, computer.
00:32:40
Speaker
I'm going to feed these to my kids.
00:32:42
Speaker
Yeah.
00:32:42
Speaker
So that's probably some type of image, image based thing, right?
00:32:45
Speaker
So you take a picture and it'll search for it.
00:32:47
Speaker
Yeah.
00:32:48
Speaker
Yeah.
00:32:48
Speaker
Yeah.
00:32:49
Speaker
So, um, so yeah.
00:32:51
Speaker
And like, that's another thing is pictures.
00:32:53
Speaker
technically are hard to work with, right?
00:32:55
Speaker
Because you can't just say, here's a picture, do math on it, right?
00:33:00
Speaker
But yeah, it has to find the outline of the flower and recognize that it's a flower.
00:33:06
Speaker
And then it's got to start from like, is it a flower?
00:33:11
Speaker
And then be like, oh, it's this kind of flower, which is tricky because their shapes are so different.
00:33:16
Speaker
So it has to recognize the general case and then narrow down to the specific case.
00:33:21
Speaker
Yeah, yeah.
00:33:22
Speaker
You can also do it on birds really easily too.
00:33:27
Speaker
Something I found out.
00:33:28
Speaker
Take picture of a bird, it'll tell you the exact species.
00:33:32
Speaker
But yeah, so text and image are kind of related in that they're both what's considered unstructured data.
00:33:39
Speaker
So structured data being like percentages.
00:33:42
Speaker
And people intuitively understand that those are magical.
00:33:46
Speaker
Like when they see a computer work with text and image, they're very impressed.
00:33:50
Speaker
Yes, yes.
00:33:51
Speaker
And because that's like, you know, closer to your human senses, right?
00:33:56
Speaker
And so text and images have a lot of the same challenges.
00:34:01
Speaker
So for images, the way they turn an image into numbers is they just make a grid and number every pixel, right?
00:34:10
Speaker
And they say, okay, pixel one, what color is it?
00:34:14
Speaker
Right?
00:34:14
Speaker
It's this color.
00:34:15
Speaker
Pixel two, it's this color.
00:34:18
Speaker
And they just do that for however many pixels are in the image, right?
00:34:22
Speaker
And then they'll use neural nets to kind of look at the image from different, you could say, like zooms and like different levels of fidelity.
00:34:33
Speaker
And it tries to kind of predict what that image is using those neural nets.
00:34:38
Speaker
And you kind of train it on a data set that has every single flower, every single bird labeled with what they are and then the picture.
00:34:47
Speaker
From a bunch of different angles and.
00:34:50
Speaker
Um, yeah.
00:34:51
Speaker
And so, yeah, go ahead.
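The pixel-numbering idea is literally just flattening the grid of pixel values into one long list of numbers; a toy NumPy sketch:

```python
import numpy as np

# A tiny 3x3 grayscale "image": each entry is one pixel's brightness (0-255).
image = np.array([
    [  0, 128, 255],
    [ 34,  90, 180],
    [ 12, 200,  64],
], dtype=np.uint8)

# "Number every pixel": flatten the grid into one long feature vector,
# which is what a model actually consumes.
features = image.flatten()

print(features.shape)   # (9,) - nine pixels, nine numbers
```

A color photo works the same way, just with three numbers (red, green, blue) per pixel, which is why real image models chew through millions of numbers per picture.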
00:34:53
Speaker
Does it, I mean, that seems like that's gotta be really compute heavy.
00:34:59
Speaker
Like it's gotta take a lot of resources.
00:35:00
Speaker
So how much of this is, 'cause people have been posting the Moore's law graph, right?
00:35:07
Speaker
The, it's like double exponential.
00:35:09
Speaker
It's an exponential function on a log scale.
00:35:14
Speaker
And, um, yeah,
00:35:17
Speaker
all these compute resources going up.
00:35:18
Speaker
And I wanted to get your take on like, do you think it's dramatically being opened up by the availability of new compute resources?
00:35:27
Speaker
Or are we just finding new techniques to, to look at the data?
00:35:32
Speaker
Yeah, I would say, um, it's kind of a combination.
00:35:39
Speaker
So, so
00:35:41
Speaker
So some of these techniques go back to like the 80s.
00:35:45
Speaker
Like they were discovered in the 80s and they just didn't have enough compute to really make use of them, right?
00:35:50
Speaker
So some of these models, so for instance, in text analytics, a popular model is a specific type of recurrent neural network called an LSTM, which stands for long short-term memory.
00:36:04
Speaker
So I believe that was first described back in the 90s.
00:36:11
Speaker
And it really didn't come to prominence, though, until maybe like 2010, 2012-ish, when people kind of realized, oh, we have enough data, number one, so like training data, meaning enough text, like a large enough corpus of text to run through this model to train it to get to know the patterns.
00:36:31
Speaker
And then also enough compute to actually keep track of all those weights of every single word in the English language and things like that, or sometimes multiple languages.
00:36:42
Speaker
And yeah, so some of it was unlocked by the advancements in compute.
00:36:50
Speaker
And what you're kind of seeing, the differences between GPT-2, GPT-3, chat GPT, and GPT-4 are just, they're making them bigger.

Philosophical Perspectives on AI

00:37:01
Speaker
So they're just computing more.
00:37:02
Speaker
So they haven't really changed the architecture much.
00:37:06
Speaker
Although with GPT-4, I think they did.
00:37:09
Speaker
But they're just throwing more data at it.
00:37:11
Speaker
And as it gets more data, it kind of just gets smarter just by having seen more stuff, if that makes sense.
00:37:19
Speaker
Yeah, and my intuition, and I'm definitely a novice,
00:37:24
Speaker
But like my intuition is that, like you're saying, these are kind of boring when you look behind the curtain.
00:37:29
Speaker
It's like, it's just sort of like lots and lots and lots and lots of regression models with lots and lots of variables stitched together and washed through, you know, over and over again.
00:37:43
Speaker
And if you're a certain type of like rationalist materialist, you sort of say like, well, that's human cognition is as simple as that.
00:37:56
Speaker
And therefore, this thing will inevitably approach and surpass us in terms of its cognitive sophistication.
00:38:04
Speaker
But it's not obvious to me that that's what's happening.
00:38:10
Speaker
Like, it seems... I'm like, given how...
00:38:17
Speaker
much data is being churned and the, the, the scale of the resources that are being deployed.
00:38:26
Speaker
I'm like, this thing clearly is like faster in a certain sense than a human brain.
00:38:31
Speaker
Like it's, it's, it's got more resources.
00:38:34
Speaker
Right.
00:38:35
Speaker
Like it doesn't forget stuff.
00:38:38
Speaker
Right.
00:38:38
Speaker
And yet the results are, are not, not especially close to a genuine human cognition.
00:38:46
Speaker
Or at least not as impressive on a creative level, for sure.
00:38:52
Speaker
Right.
00:38:53
Speaker
Yeah.
00:38:54
Speaker
And so, I mean, diving into that, like, going back to my linguistics education, there's a faction in linguistics that are called the generativists, which are led by Noam Chomsky.
00:39:09
Speaker
This is what he's famous for.
00:39:11
Speaker
He came up in like the 60s or 70s and said, the
00:39:17
Speaker
human brain is like a computer.
00:39:19
Speaker
So, but for him, it was like a 70s computer, which meant data storage was very expensive.
00:39:28
Speaker
You know, these things were running on kilobytes of data at most, right?
00:39:33
Speaker
And so he's like, we don't have the capacity in our brain to keep track of every time we heard a word, right?
00:39:41
Speaker
So we just have this like engine, this kind of like script in our brain
00:39:48
Speaker
This universal grammar is what he called it.
00:39:51
Speaker
And we just process thoughts through that universal grammar.
00:39:56
Speaker
And that's how we make speech.
00:39:58
Speaker
So we actually don't store any words in our brain.
00:40:01
Speaker
We've just got this universal grammar embedded in there.
00:40:04
Speaker
Now, that's all well and good.
00:40:06
Speaker
It's more of a philosophical point, because if you look at the human brain, there's nothing in there.
00:40:13
Speaker
There's not a universal grammar.
00:40:14
Speaker
There's no hidden little
00:40:17
Speaker
script in your brain that's doing that stuff. Also, there's a few problems, because he kind of based his research off of knowing a few Romance languages, as opposed to knowing all the various weird indigenous languages, like Hixkaryana and stuff like that, that totally break all these rules. We can talk about that some other time. But basically,
00:40:48
Speaker
It was more of a philosophical point and not really real.
00:40:51
Speaker
Now, where I studied at BYU, there's actually some professors who were doing some work saying, well, what if our brain just kind of retains as best as to its ability every time it's heard a word?
00:41:03
Speaker
So like every utterance, it's heard.
00:41:05
Speaker
Every word, it's read.
00:41:07
Speaker
You kind of retain that in your head and you learn associations through patterns and linkages in your brain.
00:41:15
Speaker
So like,
00:41:16
Speaker
You may not know the exact definition of a word, but you've heard it in certain contexts so much that you kind of understand what the word means.
00:41:24
Speaker
And it also explains why, like, your accent doesn't change after a certain point.
00:41:31
Speaker
Right.
00:41:31
Speaker
So like if you're an American and you go live in Scotland at thirty-five, you're not going to start speaking with a Scottish accent necessarily, except maybe on words you've never heard before.
00:41:43
Speaker
Hmm.
00:41:43
Speaker
Right.
00:41:44
Speaker
But if you're a kid, you've heard less words.
00:41:46
Speaker
And if you move, your accent may change, right?
00:41:50
Speaker
Whereas there's not really an explanation for that under Chomsky's model.
00:41:55
Speaker
But anyway, so that's kind of what has taken place with NLP in recent years.
00:42:05
Speaker
So basically around 2012, they came up with this thing called
00:42:12
Speaker
word2vec or word vectors, right?
00:42:16
Speaker
And they said, instead of like, so previously the way they turned words into numbers was they would say, all right, the first word we see is going to be word number one.
00:42:27
Speaker
So it's going to be number one.
00:42:28
Speaker
That's how we're going to represent it.
00:42:30
Speaker
And so like they kind of did machine learning by saying, okay, this sentence has word one, word 64,
00:42:36
Speaker
Word 307.
00:42:37
Speaker
So that means it might mean this.
00:42:41
Speaker
And it was able to do some machine learning that way.
00:42:43
Speaker
There's a few other ways that they tried to do it, like one-hot encoding and other ways of encoding words.
00:42:50
Speaker
But that doesn't establish linkages between the words.
00:42:53
Speaker
Correct, because they're just arbitrary meanings.
00:42:56
Speaker
So you don't retain the semantic information.
00:42:59
Speaker
So the actual meanings of the words or whether it's related to another word or
00:43:05
Speaker
whatever. But with word vectors, what happened was they said, okay, let's represent these words in a vector space. Now, that's a math thing that
00:43:20
Speaker
might make sense if you're doing like multi-dimensional.
00:43:22
Speaker
Yes.
00:43:23
Speaker
One.
00:43:24
Speaker
Okay.
00:43:24
Speaker
So one dimension is a number line.
00:43:27
Speaker
Two dimensions is your X and Y chart.
00:43:30
Speaker
Nobody's going to see this video, but yeah, it's an X and Y chart.
00:43:33
Speaker
Three dimensions is that same chart in 3D.
00:43:36
Speaker
So you've got three.
00:43:37
Speaker
You've got height along with the two directions.
00:43:39
Speaker
So the way I explain it is, is you're giving the word an address, right?
00:43:44
Speaker
Yeah.
00:43:44
Speaker
So, so the address, for instance, like if you're in downtown New York,
00:43:49
Speaker
and you want to tell someone where to meet you, you have to tell them, okay, meet me on this street that crosses with this street.
00:43:57
Speaker
So that's your, your two dimensions.
00:43:58
Speaker
Right.
00:43:59
Speaker
And, and then meet me on the 30th floor.
00:44:03
Speaker
That's your three dimensions.
00:44:04
Speaker
Right.
00:44:04
Speaker
So now they know where to meet you.
00:44:06
Speaker
Now there are other dimensions, such as time; in reality that adds a fourth dimension.
00:44:14
Speaker
In this word vectorization, they're adding sometimes over a thousand different dimensions to these words.
00:44:19
Speaker
And they're essentially trained through the neural network.
00:44:23
Speaker
But all it's doing is it's trying to place the word in a space that is mathematically representable.
00:44:32
Speaker
Though hard to imagine once you get above three or four dimensions for the human.
00:44:36
Speaker
Right.
00:44:36
Speaker
Right.
00:44:38
Speaker
But what ends up happening is they found out, okay, if we put the word king...
00:44:44
Speaker
In, you know, we train it and we find out what its vectors are exactly, so we place it in an address.
00:44:52
Speaker
It ends up being closer to the word man than it is to the word woman, right?
00:44:58
Speaker
And then what they also found out is that the distance between man and king was the same as the distance between woman and queen, right?
00:45:09
Speaker
So these relationships also were preserved.
00:45:13
Speaker
And so what ended up happening is through this very simple process that's actually not super computationally heavy compared to other things they were doing, they were able to retain information about words based on how it was trained.
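The king/queen arithmetic can be sketched with toy vectors. The two "dimensions" and all the numbers below are invented purely for illustration; real embeddings learn hundreds of opaque dimensions, as discussed above.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, 0.0 means unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# One-hot encoding: every word gets its own slot, so every pair of
# words is equally unrelated; no semantic information is retained.
king_oh, queen_oh = np.eye(4)[0], np.eye(4)[1]
print(cosine(king_oh, queen_oh))        # 0.0

# Toy dense vectors: made-up dimensions ("royalty", "maleness").
vec = {
    "king":  np.array([0.9, 0.9]),
    "queen": np.array([0.9, 0.1]),
    "man":   np.array([0.1, 0.9]),
    "woman": np.array([0.1, 0.1]),
}

# The famous analogy: king - man + woman lands on queen's "address".
analogy = vec["king"] - vec["man"] + vec["woman"]
print(cosine(analogy, vec["queen"]))    # ~1.0
```

The point of the contrast: under one-hot encoding every word is equidistant from every other word, while in the learned vector space the distances themselves carry meaning.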
00:45:29
Speaker
So like I said, that came out in 2012.
00:45:31
Speaker
The next few years were spent...
00:45:36
Speaker
Some of the bigger companies, like Google, said, well, what if we just take all of Wikipedia and pre-train a model on everything in Wikipedia, right? And then they said, what if we get more than that, right? And so they came out with these models, like GloVe and other models, that essentially were like, we're going to pre-train on the English language. So you're just going to have, like,
00:46:01
Speaker
it's already going to know what every word means before you even boot it up, if that makes sense.
00:46:07
Speaker
Yeah.

Education and Practical Skills in NLP

00:46:09
Speaker
And so that was kind of the start of these large language models, which is what GPT is one of them.
00:46:16
Speaker
But then they kind of just kept making them bigger and bigger.
00:46:20
Speaker
But then 2017, Google releases a paper called Attention is All You Need.
00:46:29
Speaker
And you can go look up the paper if you're interested.
00:46:31
Speaker
But basically, they said, we're going to use this thing in machine learning called the attention mechanism. It essentially allows these models, instead of using what they were using before, these LSTMs, long short-term memory, to pay attention to larger contexts of words. LSTMs kind of worked like a snowplow,
00:46:56
Speaker
where they had no clue what was in front of them.
00:46:59
Speaker
So that's just like virgin snow.
00:47:01
Speaker
They have never seen it before.
00:47:03
Speaker
And then what trails off behind them starts getting more snow on it and starts covering up again.
00:47:09
Speaker
So it kind of forgets.
00:47:11
Speaker
So it can only remember a certain window of words.
00:47:15
Speaker
And that's kind of how LSTMs work.
00:47:19
Speaker
But this attention mechanism allowed it to pay attention to what's in front, what's behind, and kind of what was three paragraphs ago.
00:47:26
Speaker
So that's why like when you do your predictive text on your phone, it's based off of these older models and it starts looping around because it forgets that it's already said something.
00:47:35
Speaker
Right.
00:47:35
Speaker
Right.
00:47:36
Speaker
Right.
00:47:36
Speaker
But this attention mechanism, once again, it's a very simple thing; they just kind of figured out a way to do it.
00:47:45
Speaker
And what it allows for is,
00:47:49
Speaker
it can pay attention to this context, and it actually does better at things like machine translation because it doesn't really care about word order as much.
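At its core, the attention mechanism from that paper is a weighted average where every word scores every other word, near or far; a minimal NumPy sketch of scaled dot-product attention, with random vectors standing in for real word representations:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: rows become weights that sum to 1.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: every position scores every other
    # position, so nothing "snows over" the way it does in an LSTM window.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # word-to-word relevance scores
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights

# 4 "words", 8-dimensional vectors (random, just to show the shapes).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out, w = attention(x, x, x)              # self-attention: Q = K = V

print(out.shape, w.shape)                # (4, 8) and (4, 4)
```

The 4x4 weight matrix is the "paying attention" part: row i says how much word i draws on every word in the sequence, including ones far behind or ahead of it.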
00:47:59
Speaker
And so these models came out, and they were called transformer models.
00:48:06
Speaker
So the T in GPT stands for transformer.
00:48:11
Speaker
What are the G and the P?
00:48:13
Speaker
What's that?
00:48:14
Speaker
What are the G and the P?
00:48:16
Speaker
They are generative pre-trained
00:48:20
Speaker
transformer model. So generative, because it's kind of intended to generate text, right? And then pre-trained, like it's a large pre-trained language model. And so yeah, that's what it is. So GPT came out, then quickly GPT-2 came out, that was in 2019, and then GPT-3 came out the year after.
00:48:43
Speaker
And so I knew about it back then.
00:48:45
Speaker
And I was like, this is really cool.
00:48:47
Speaker
This is cool stuff.
00:48:48
Speaker
It can do cool stuff.
00:48:49
Speaker
I got access to the API.
00:48:51
Speaker
I was like playing with it.
00:48:53
Speaker
Then, you know, last, what was it like last summer or something?
00:48:56
Speaker
They were like, hey, we're coming out with chat GPT.
00:49:00
Speaker
Now, what that was, was the model that had already been released,
00:49:06
Speaker
packaged as a chatbot for the general public to use, without needing API token access, if that makes sense.
00:49:14
Speaker
Yeah.
00:49:15
Speaker
Right.
00:49:16
Speaker
So this big trendy thing that came out last year was actually a few years old, and they just came out with an interface for it.
00:49:23
Speaker
That's all it was.
00:49:25
Speaker
Ah, so yeah, that's been kind of my intuition as I've talked to ChatGPT and as I've seen its output.
00:49:34
Speaker
It's kind of like, it's not necessarily that there's been this quantum leap in what AI can do.
00:49:43
Speaker
It's more just that like lots of us who don't know how to code are getting to access some of those tools.
00:49:50
Speaker
Yeah, yeah.
00:49:51
Speaker
Yeah, yeah.
00:49:52
Speaker
Which that seems to me like a pretty good thing.
00:49:54
Speaker
Yeah.
00:49:55
Speaker
And GPT-2, for instance, is a dumber version of it that's fully open source that you can just use however you want.
00:50:04
Speaker
But yeah, so chat GPT is just GPT-3 with a nice little wrapper around it for the layperson to use.
00:50:14
Speaker
Which is why also like when people use it, like I said, like they're trying to treat it like a bot and they're like, hey, Alexa, what's the weather today or whatever.
00:50:23
Speaker
But how you should really treat it is you're giving it an input and you want to get the output based on what you tell it.
00:50:29
Speaker
So like some people in a chat we were in, they were complaining like, well, it writes dumb, like it doesn't write at a very high level.
00:50:39
Speaker
And I'm like, well, did you tell it to write at an academic level?
00:50:43
Speaker
And they're like, no, I just said, explain to me what Bigfoot is or whatever, you know, I don't know.
00:50:49
Speaker
And, but you have to tell it what style you want it to write in.
00:50:52
Speaker
If you want it to sound like an author, you can say, Hey, can you sound like this author?
00:50:57
Speaker
It's great at that.
00:50:58
Speaker
It's great.
00:50:59
Speaker
Yeah.
00:50:59
Speaker
Yeah.
00:50:59
Speaker
It's great at mimicking styles and things like that.
00:51:02
Speaker
So.
00:51:03
Speaker
So kind of like trying to learn how to use it, you kind of have to not think of it as a chat bot.
00:51:08
Speaker
You have to think of it as like, okay, I'm giving it an input and I want a specific output.
00:51:13
Speaker
So how do I get that output?
00:51:15
Speaker
Yeah.
00:51:16
Speaker
Yeah.
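Treating it as input-to-output rather than chat often just means spelling out style, audience, and format explicitly; a toy sketch of that kind of prompt template (the fields are my own illustration, not any official API):

```python
def build_prompt(task: str, style: str, audience: str, length: str) -> str:
    # Assemble an explicit prompt instead of a bare question, so the
    # model is told what register and shape of output you want.
    return (
        f"Write in the style of {style}, for {audience}. "
        f"Keep it to {length}.\n\n"
        f"Task: {task}"
    )

prompt = build_prompt(
    task="Explain what Bigfoot is.",
    style="an academic literature review",
    audience="graduate students",
    length="two paragraphs",
)
print(prompt)
```

The same bare task ("Explain what Bigfoot is") produces very different output depending on what gets wrapped around it, which is the whole complaint about it "writing dumb" by default.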
00:51:16
Speaker
Do you see a, a future for like the prompt engineer?
00:51:21
Speaker
Do you think that that's a, a valuable skillset to learn?
00:51:25
Speaker
I've seen they're paying like two to three hundred thousand for prompt writers now, and
00:51:31
Speaker
I think that could be a real job that happens, though it seems like a meme at the current moment.
00:51:37
Speaker
Like, I wouldn't put all my eggs in that basket.
00:51:39
Speaker
But I do think, if you are at some kind of company, and if you spend some time figuring out how to write prompts and then told, you know, higher-ups at your company, hey, I'm a good prompt writer,
00:51:54
Speaker
It might work out well for you, you know.
00:51:57
Speaker
Yeah, you'd probably just have to show them some outputs, right?
00:51:59
Speaker
Like I told it to do this and it did this.
00:52:03
Speaker
And I'm actually thinking of running a prompt writing webinar as part of my thing.
00:52:12
Speaker
I would attend your prompt writing webinar.
00:52:13
Speaker
I think that's... Because I have sort of some vague intuitions about how this thing could be used, but I haven't been able to...
00:52:23
Speaker
I haven't been able to make it do exactly what I want.
00:52:25
Speaker
And I can tell that I'm missing something about the architecture.
00:52:29
Speaker
Like I'm not catching what it's designed to do because I've seen it do some pretty extraordinary things.
00:52:36
Speaker
And I, I, I wonder if, so if somebody is coming out of college, right.
00:52:41
Speaker
And they're like, I want to, this AI stuff is cool.
00:52:46
Speaker
would you actually, at this stage of the game, with the tools as they currently exist, tell someone, hey, linguistics is a pretty valuable thing to understand?
00:53:00
Speaker
Yes.
00:53:02
Speaker
I'm very pro-linguistics, but that may be my own bias. I mean, obviously I think econ's cool, but I would never tell someone to major in economics.
00:53:14
Speaker
Right, right.
00:53:14
Speaker
So I was actually talking to my little nephew who just got accepted to BYU.
00:53:20
Speaker
And so he was asking me about AI stuff.
00:53:23
Speaker
And I was talking to him about exactly this.
00:53:26
Speaker
If I had to do it again, I would have majored in linguistics and minored in computer science.
00:53:32
Speaker
Just to get more of that code credentials behind me.
00:53:36
Speaker
But...
00:53:41
Speaker
That being said, there's very few out there.
00:53:45
Speaker
I don't think there's any colleges that offer NLP as an undergrad focus.
00:53:53
Speaker
BYU being one of the only schools I know of that even has NLP courses in their undergrad program for linguistics.
00:54:01
Speaker
So most schools where you're learning linguistics, you're just going to learn Chomsky philosophy.
00:54:07
Speaker
And it's a you know, it's a Bachelor of Arts that you're getting.
00:54:10
Speaker
Right.
00:54:11
Speaker
But just due to the fact that BYU is like very anti-Chomsky for some reason, all the professors there hate Chomsky.
00:54:19
Speaker
I actually, I mean, so maybe we can, we can talk about this a little bit.
00:54:24
Speaker
I actually, you know, despite all of the sort of press around BYU and, and some of these professors who suck really bad, like most of the programs that I've been involved in or seen firsthand, um,
00:54:40
Speaker
they're, you know, academia being academia. But even down to the granular level of, are we about Chomsky or not?
00:54:49
Speaker
They're surprisingly like,
00:54:52
Speaker
clear headed.
00:54:52
Speaker
And what you're saying about their approach to linguistics is like, hey, let's teach people some actual freaking marketable skills that are deployable around this subject.
00:55:05
Speaker
Uh, econ was the same way.
00:55:06
Speaker
And like, so actually if, if I could make every employer understand what you learn at the BYU economics department, uh,
00:55:17
Speaker
I think that would be a really marketable thing.
00:55:19
Speaker
The problem is it's called an economics degree, which at most schools is basically just like the communist manifesto, like literally.
00:55:28
Speaker
And so, because we did regression analysis, we did like lots of data work.
00:55:36
Speaker
It was kind of a proto data science degree.
00:55:38
Speaker
And that was, honestly, kind of what made me think. That was my first
00:55:46
Speaker
jaunt into like, oh, these are business questions.
00:55:50
Speaker
These are like technical questions.
00:55:52
Speaker
These are empirical questions that I actually find interesting.
00:55:55
Speaker
And I could actually, you know, build a career around this.
00:55:58
Speaker
I wouldn't want to, you know, neck myself all the time.
00:56:01
Speaker
And I think, yeah, you saying that about the linguistics program actually makes me kind of affectionate for our shared alma mater a little bit.
00:56:13
Speaker
Yeah, yeah.

Shaolin AI: Bridging Academic Gaps

00:56:14
Speaker
Well, it was a very unique place for linguistics, like I said, and I kind of stumbled into it just trying to be lazy in school.
00:56:21
Speaker
But yeah, maybe this is a good transition. This is part of the reason why I've started Shaolin AI: for undergrad, you can't do a data science degree, even though there's a million jobs for it, right?
00:56:43
Speaker
you can't do a data analytics undergrad.
00:56:46
Speaker
There's a computer science degree where you may do that in one or two classes, but you're gonna spend 50% of your time learning C
00:56:54
Speaker
which you don't need to know if you're going to be a Python coder or an R coder, right?
00:56:58
Speaker
You don't need to know base machine code and the theory behind computational stuff.
00:57:06
Speaker
If you're just going to be running packages, who cares?
00:57:09
Speaker
The curriculum is definitely built around the tastes of nerds of a certain flavor.
00:57:15
Speaker
Right, and it's very academic in its nature, so you're learning...
00:57:19
Speaker
Like academically, why is this important when really you're trying to get a job as a data scientist?
00:57:24
Speaker
Why are you wasting time learning a skill that you're never going to use right now?
00:57:29
Speaker
If you're going to be a software engineer, yeah, you need to do the computer science degree.
00:57:33
Speaker
But we've got this whole new class of jobs, not to even mention the AI
00:57:38
Speaker
class of jobs.
00:57:40
Speaker
That's like one step beyond data science.
00:57:42
Speaker
Right.
00:57:44
Speaker
that like you can't even get an education on in your undergrad if you want it.
00:57:49
Speaker
Now, there are plenty of like master's programs and things like that around, but you'd have to sit through four years, then go pay another however many millions of dollars to get your master's in NLP or data science.
00:58:03
Speaker
And once again, even there, a lot of those programs are very academic and research focused and not really trying to get you to get a job in the industry.
00:58:12
Speaker
So like, that's a problem like,
00:58:14
Speaker
We'll look to get PhD level people on our teams.
00:58:17
Speaker
And it's like, you know, they come out and they have no business sense.
00:58:24
Speaker
They have no clue what we're trying to do.
00:58:26
Speaker
And they're just like, well, this is an interesting thing I could write a paper on.
00:58:29
Speaker
It's like, well, we're not writing papers, you know.
00:58:32
Speaker
So like...
00:58:35
Speaker
Yeah, there's kind of this whole disconnect between the pedagogy of of universities and the actual jobs that are out there that are like fun, good paying jobs.
00:58:47
Speaker
I don't know.
00:58:47
Speaker
Yeah, yeah. And so to introduce that a little bit: you came to Exit with this concept for a boot camp, because you run boot camps professionally, right? You facilitate
00:59:02
Speaker
You you facilitate
00:59:04
Speaker
some boot camps.
00:59:05
Speaker
So you know how the curriculum runs and you and I have compared notes on what your boot camps that you teach are like and what the boot camps I attended were like.
00:59:13
Speaker
And they're all the same.
00:59:14
Speaker
There's a very clear, very well-defined set of things that every data science program needs you to understand.
00:59:23
Speaker
And it's basically like a couple of weeks of basic stats and
00:59:28
Speaker
a couple of weeks of Python, and a couple of weeks of let's bring the stats and the Python together.
00:59:35
Speaker
And then you do a capstone project.
00:59:39
Speaker
And so you had the idea of let's jump into that environment with... Because there's not like one solution to this.
00:59:50
Speaker
There's like...
00:59:55
Speaker
It's like mowing lawns.
00:59:56
Speaker
Like you're not going to run out of lawn care businesses because you need to have people to babysit the coders, to babysit the learners, right?
01:00:04
Speaker
So this is a business model where, you know, even though you've heard of a million different data science boot camps, there's not enough data science boot camps because it doesn't scale up.
01:00:18
Speaker
It doesn't universalize.
01:00:20
Speaker
Yeah.
01:00:22
Speaker
And yeah, so a few years ago, I started teaching boot camps, right?
01:00:27
Speaker
So I'm teaching people how to code in Python and trying to get them maybe a data analyst job.
01:00:33
Speaker
If they have some type of degree, maybe a data science job, right?
01:00:39
Speaker
Or if they're just super...
01:00:40
Speaker
with it, you know. And I've seen a bunch of success from it, even though the curriculum I was going through was sometimes frustrating in its quality, I guess. And so, you know, I was sitting there... But then also, I was kind of at a different job at the time, and I had this kind of existential dread of going to work in the mornings, but then I was teaching a boot camp at night.
01:01:09
Speaker
And I looked forward to it every night.
01:01:11
Speaker
It was fun.
01:01:12
Speaker
I enjoyed it.
01:01:13
Speaker
And it turns out I'm kind of good at teaching, I guess.
01:01:21
Speaker
And where the idea initially sprang from was just finding out how much they were charging the students and then how much I was getting paid to teach the boot camp, which was like one and a half students' worth.
01:01:36
Speaker
Right.
01:01:36
Speaker
And you were teaching like 30.
01:01:38
Speaker
And I was teaching like 30.
01:01:39
Speaker
Yeah.
01:01:39
Speaker
So I'm like, well, if I just get like half of this, I'm good to go.
01:01:46
Speaker
And so, yeah.
01:01:48
Speaker
Yeah.
01:01:48
Speaker
So I came up with the idea.
01:01:50
Speaker
The business name is Shaolin AI.
01:01:52
Speaker
You can visit the website.
01:01:53
Speaker
It's shaolin.ai, named after the Shaolin Monastery, or the Shaolin monks, because I did Shaolin Kung Fu growing up.
01:02:02
Speaker
And I think it's cool.
01:02:03
Speaker
Nice.
01:02:04
Speaker
I don't know.
01:02:05
Speaker
But yeah, also, it's about training.
01:02:09
Speaker
It's about all that stuff.
01:02:11
Speaker
Discipline.
01:02:12
Speaker
Courage.
01:02:13
Speaker
Courage.
01:02:14
Speaker
Yeah.
01:02:14
Speaker
Yeah.
01:02:17
Speaker
And so anyway, I came up with the idea that, hey, I could make a much better curriculum that's even more relevant to actual jobs.
01:02:29
Speaker
And I have my
01:02:31
Speaker
A.I.
01:02:32
Speaker
credentials and specialty, and also there were a lot of guys in Exit who have similar credentials, or maybe slightly divergent ones, but who are kind of interested in getting either a side income or branching out a little bit.
01:02:48
Speaker
And so we've kind of put together a whole team
01:02:55
Speaker
From Exit and elsewhere.
01:02:57
Speaker
of people that are contributing to the development of the bootcamp curriculum and kind of the behind the scenes side of things to make sure we have like jobs lined up for graduates and things like that.
01:03:11
Speaker
So it's kind of been a really cool project and a really cool aspect of Exit where like all these guys have come together just to help me out and
01:03:21
Speaker
Currently, we're not making any money, but people are just helping out because they're interested in the project or potentially down the line.
01:03:30
Speaker
It will come back to them where they'll have an opportunity to TA or teach at the bootcamp.
01:03:36
Speaker
And it's been a really cool process.
01:03:38
Speaker
So yeah.
01:03:39
Speaker
Awesome.
01:03:39
Speaker
So yeah, so we've got from the group, you've pulled play testers, you've pulled some people to talk to you about the business side, the marketing side, and you've pulled people who...
01:03:49
Speaker
are maybe helping you develop the data science and NLP curriculum? Or do you have people working on any other curricula at this point?
01:03:56
Speaker
Or are they focused on the data science and NLP?
01:03:58
Speaker
We are.
01:04:00
Speaker
So the plan right now is to launch the data science curriculum next month; we're looking to go live by the end of the month.
01:04:10
Speaker
And so that's where the major focus has been.
01:04:12
Speaker
But we've also got a bunch of web guys and other guys.
01:04:16
Speaker
And kind of the idea is we're trying to say,
01:04:19
Speaker
you know, there are a lot of bootcamps that can teach you other skills, but what will get you a job now, and what's relevant now, especially in this kind of AI landscape and stuff like that.
01:04:31
Speaker
So, so including additional stuff.
01:04:34
Speaker
So we have started working on additional bootcamps.
01:04:37
Speaker
So like crypto and
01:04:41
Speaker
web development and things like that.
01:04:44
Speaker
So those will be coming shortly, but we're kind of focusing on the first one first because that will kind of get us out of the gate without complicating things.
01:04:55
Speaker
Well, and I think that there's something to be said for as these tools become more sophisticated, and I'm not even necessarily talking about GPT, but just like the availability of some of these libraries to...
01:05:09
Speaker
you're technically sophisticated, but you're not a computer science guy.
01:05:12
Speaker
You're a word cell, right?
01:05:14
Speaker
Like myself.
01:05:15
Speaker
And I think genuinely one of the things that these tools open up is the ability for people like you who have, like a guy like you and a guy like me
01:05:30
Speaker
30 years ago, we would have had like the pedagogical gift to explain this material, but we wouldn't actually understand it because it would be, you know, you'd have to be kind of a computer science guy to follow along.
01:05:48
Speaker
And I think computer science guys have no ability to communicate.
01:05:51
Speaker
None, none whatsoever.
01:05:53
Speaker
And so, by just shortening that gap and allowing guys like me to jump across it,
01:06:01
Speaker
I think it facilitates a whole universe of transactions that produce a lot of value.
01:06:07
Speaker
So that's really exciting, man.
01:06:09
Speaker
Can you tell me a little bit about how you used the group?
01:06:14
Speaker
Cause like we didn't have at the time,
01:06:17
Speaker
like a protocol to hand you for like, Hey, this is how you find your partners and this is how you get started.
01:06:24
Speaker
So can you tell me about that process?
01:06:26
Speaker
Yeah.
01:06:27
Speaker
Yeah.
01:06:27
Speaker
And it's been kind of a learning process because, you know, I'm not necessarily a hardcore entrepreneur in any sense. I've just been a W-2 guy my whole life, you know?
01:06:43
Speaker
So a lot of this stuff was a learning curve for me, but
01:06:48
Speaker
What was cool is we kind of just described the goal, and some people just showed up from that.
01:06:58
Speaker
And so people who were interested in the goal showed up, and they've been super helpful.
01:07:05
Speaker
And, you know, I've gotten to know a lot of guys in the group much better through this process.
01:07:10
Speaker
But then also, yeah.
01:07:13
Speaker
kind of as we hit roadblocks or gaps in our knowledge base of what we had, we'd kind of throw it out to the different specialties.
01:07:23
Speaker
And there's enough people in Exit that, you know, someone knows something about something.
01:07:27
Speaker
So, yeah.
01:07:30
Speaker
So when I didn't know how to start an LLC, it turns out a couple of guys had like a bunch of things for doing that.
01:07:37
Speaker
So like the business side has been kind of cool.
01:07:39
Speaker
And then,
01:07:40
Speaker
Then working on the technical side, it turns out we have some other really cool AI NLP heavy hitters in the group that have stepped up to offer advice.
01:07:53
Speaker
Some really impressive technical guys, for sure.
01:07:55
Speaker
Yeah, I'm not going to name drop or whatever, but yeah, they're up there.
01:08:01
Speaker
So yeah, it's been really cool just to see who shows up.
01:08:06
Speaker
Some of the challenges have just been
01:08:09
Speaker
you know, being in this remote group, how do you like keep the communication lines open?
01:08:16
Speaker
How do you make sure everybody knows what's expected of them and what's not expected of them and stuff like that.
01:08:23
Speaker
But we've kind of been figuring it out, and we've settled into a core team now, I feel like.
01:08:29
Speaker
And I'm hoping that this project will eventually also give back to either the guys who join Exit who are interested in
01:08:39
Speaker
getting coding skills, like they can come to an Exit-approved coding camp.
01:08:44
Speaker
Yeah, absolutely.
01:08:47
Speaker
You know, and also just, you know, potentially getting guys hired as TAs so that they can get experience in the industry.
01:08:55
Speaker
So they have a company with AI in the name on their resume, and they can go right from there and get out there.
01:09:03
Speaker
So.
01:09:04
Speaker
So, yeah, it's been cool.
01:09:06
Speaker
Yeah, well, I really appreciate you coming to talk about it.
01:09:10
Speaker
And so you guys, we're going to release it next month, though.
01:09:14
Speaker
So Shaolin.ai.
01:09:17
Speaker
The site's up.
01:09:18
Speaker
Yeah, the site's up.
01:09:19
Speaker
You can sign up right now.
01:09:23
Speaker
If you have more questions, basically just go to the website, sign up, and we'll go over it.
01:09:28
Speaker
The schedule is essentially three days a week outside of normal working hours, so you keep your day job.
01:09:36
Speaker
if you want, while you learn how to code. Essentially, it's designed for beginners.
01:09:44
Speaker
But also, if you have maybe intermediate skills and you're looking to add machine learning and AI to your portfolio, it would be worth it.
01:09:53
Speaker
And it's a six month bootcamp.
01:09:55
Speaker
So after six months, you'll essentially have a really fully fledged GitHub that can be used as your portfolio for getting a new job.
01:10:04
Speaker
And we're new, but I've been teaching boot camps for a while.
01:10:09
Speaker
And we've had a lot of success stories from students that I've taught. You know, I had one guy who was a warehouse worker, stuffing boxes with stuff.
01:10:19
Speaker
And now he's working as a data analyst at a tech company making a lot more money than he was.
01:10:25
Speaker
And we've got a bunch of stories like that.
01:10:29
Speaker
Well, it's really exciting, man.
01:10:31
Speaker
Like, I love to see these kinds of things take off because, you know,
01:10:34
Speaker
I love the way that it nourishes the group, right?
01:10:37
Speaker
Like, number one, the guys who have the knowledge can teach, and the guys who want to learn can show up and be part of it.
01:10:47
Speaker
So I just really appreciate you taking the initiative to set it up, man.
01:10:53
Speaker
And it's great to hear from you.
01:10:55
Speaker
So thanks for being here, man.
01:10:57
Speaker
Yeah, thank you.
01:10:59
Speaker
All right.
01:11:00
Speaker
Yeah, if you want to,
01:11:02
Speaker
get involved with what he's doing at Shaolin.ai.
01:11:05
Speaker
If you have a project like this that you want to come build in the group, we definitely have the lawyers and the accountants and the technical and basically any expertise that you want to start a business.
01:11:18
Speaker
We've got the guys in here and many of them are looking for projects to attack.
01:11:26
Speaker
But you can learn about all that at exitgroup.us and come check us out.