The community-building manifesto | Kunal Jain @ Analytics Vidhya


Founder Thesis
197 plays · 1 year ago

A community startup is built on the back of a strong, organically grown community base, hyper-focused on adding value to the community. Kunal Jain is one of the early pioneers of community building, and he shares his journey of starting a bootstrapped community and how it became a full-fledged platform for analytics professionals with a global reach.

For more such interesting founder journeys, subscribe to our newsletter at founderthesis.com

Read more about Analytics Vidhya:

1. Demand for data science professionals continuously increasing: Analytics Vidhya

Transcript

The Rise of Community Startups

00:00:16
Speaker
As social media platforms and tech companies are increasing privacy protections, brands are realising that it is no longer that easy to pay dollars to reach consumers. And this has led to the rise of community start-ups. A community start-up is a start-up built on the back of a strong, organically grown community base, and that is hyper-focused on adding value to the community.
00:00:22
Speaker
Hi everyone, this is Kunal, founder of Analytics Vidhya.
00:00:37
Speaker
At this stage, you might wonder what are the ways in which a community startup can become a large business. And that's where the story of Analytics Vidhya comes in.

Building Analytics Vidhya: Kunal Jain's Journey

00:00:46
Speaker
In this episode of the Founder Thesis Podcast, your host Akshay Dutt interviews Kunal Jain, who is one of the early pioneers in community building and the founder of Analytics Vidhya.
00:00:55
Speaker
Kunal shares his journey of starting a community that bootstrapped its way, and how the community became a full-fledged platform for analytics professionals with a global reach. Stay tuned, and subscribe to the Founder Thesis Podcast on any audio streaming app to learn from veteran startup founders who have changed the landscape.

Entrepreneurial Culture at Capital One

00:01:22
Speaker
And then I went to IIT Bombay for five years, studied aerospace engineering, bachelor's and master's. And post IIT Bombay, I went to Capital One in the UK, spent four years with Capital One.
00:01:38
Speaker
And Capital One has given birth to a lot of entrepreneurs. Quite a few founders I've interviewed have worked in Capital One. So I guess it has that kind of a culture which promotes entrepreneurial thinking. Yeah, very much. In fact, as a culture, it promotes open thinking. It promotes people challenging each other very openly. And it's still one of those organizations where I see a lot of people
00:02:06
Speaker
venturing out from Capital One and then realizing, oh, the world is not safe and coming back. So a very high comeback rate of Capital One alumni. And so what next after your Capital One stint? So I spent about four years at Capital

Building Aviva's Analytics Team

00:02:24
Speaker
One. Finally, in 2010, I came across an opportunity where Aviva Life Insurance was setting up its analytics team.
00:02:32
Speaker
And also, I mean, the UK was still struggling with the entire recession and growth seemed far away. So I said that I've got experience of a more mature market, why not now set this team up at Aviva? So that's when, in 2010, I moved to Aviva Life Insurance and started this team, which initially was just two people. And that scaled to about 25 people by the time I left.
00:03:00
Speaker
And at that time we had created one of the best analytics units in the country in insurance. So the kind of problems we were working on, the way we had structured data. What was the impact of this unit? Give me some examples.
00:03:16
Speaker
So we started in 2010. In 2012, the agent productivity and the agent selection was entirely driven by data. So every agent when they fill their application, we had an engine running at the back saying what are the chances that this person would become a good agent.
00:03:38
Speaker
And their day-to-day engagement with their sales managers and branch managers was driven by data. So in those four years, I worked very closely with the entire sales team, the banking partners. So we created a model for one of the bank partners to tell them who in their customer base
00:04:03
Speaker
would have a high propensity to buy insurance from Aviva. So we essentially worked as an analytics team at the banks. But what happened was, during this period of scale, my role ended up becoming very managerial, to the extent that I was spending, I think, a good 70% of my time in meetings.
00:04:25
Speaker
and the remaining third tied up in a lot of managerial tasks. So I had stopped enjoying that role. I was feeling that, you know, I was not growing as much on my technical front.

From Blog to Community Platform

00:04:37
Speaker
And in 2013, finally, I said that let me start a blog and share my learning. So another particular instance I remember was, as we were trying to see if macroeconomic variables have an impact on
00:04:53
Speaker
the renewal payments which customers make. So if the economy goes south, what does it mean for the renewal premiums which would come in? And there was no place I could go and ask these questions. So I ended up calling a few batchmates from IIT days who I knew were working in analytics.
00:05:15
Speaker
So this need of having a place to share learnings, ask questions, get answers from experts kept coming back again and again while I was at Aviva for about four years. So in 2013, I said, OK, I don't know the way to solve it, but at least let me start by putting a blog out there where I can share my learnings with the larger world.
00:05:39
Speaker
And that's how Analytics Vidhya started. So in 2013, I booked the domain. I started writing articles. And from there, it started taking on a life and shape of its own. So people started coming back. A couple of friends said that we think this is brilliant, we would want to also contribute. And in about nine months, so I did this in parallel to my job for about nine months.
00:06:06
Speaker
It was very evident that I could create a much larger impact by doing this full time. But there were a lot of uncertainties in terms of, for example, there was no revenue model figured out by that time. But what I could see clearly was that the impact is much bigger.
00:06:23
Speaker
And because I had savings, I could go on without revenue for a couple of years. So I said that, let me take the plunge and do this full time. So in 2014, I put down my papers and we started building Analytics Vidhya full time. What were the hits or users, or what were some of those stats, when you quit?
00:06:48
Speaker
Yeah. So on my last day, in that month, we had got some 8,000 visits a month. So that was the number. In fact, I remember very clearly, one of my friends at Aviva asked me, it's a good personal blog to start with, you're getting some visits, but how big do you think it would become?
00:07:10
Speaker
And at that point, my answer was, if this grows 4x in a year, I'll be happy. So we ended up growing 10x that year, in the middle of all the problems. So very early days, but that impact was, I could feel that impact, right? People coming back, asking questions on
00:07:30
Speaker
comments and then the engagement on let's say LinkedIn or other platforms. So it felt that we are onto something. It's just that we were not able to define it or clearly see how it would essentially become a business. So what was the plan when you quit? You wanted to build traffic and monetize through ads or what like?
00:07:53
Speaker
At that time, it was just that, let me convert this into a community. So we added discussion portals as soon as I went full time. And monetization was essentially through ads. So there was no other monetization. In fact, the first revenue came three months after I had quit Aviva and was doing this full time.
00:08:12
Speaker
So the thought was to convert it into a community portal. We launched these discussion portals. Then we added a hackathon platform so we could release problem statements and community would work on it.

Transition to Media Company

00:08:27
Speaker
We started doing meetups across the country. So we would do a meetup. In each meetup, we'll take a problem. We'll help people solve that through.
00:08:38
Speaker
In 2017, we did India's largest AI and ML conference in Bangalore.
00:08:43
Speaker
2018, we launched our own courses and programs. So till 2017, it was essentially like a media company. You were doing everything which a media company does, like producing content and running events. These are typically how media companies monetize, like advertising on content and events would have been paid events, I'm guessing, or sponsored events. So paid events and sponsored events. So it was actually till
00:09:10
Speaker
2018, it was a media model in terms of business, and the DNA was, let's continue to build a strong user base and community. And by 2018, what were the numbers?
00:09:25
Speaker
So 2018, we would be getting probably a million visits a month. That's where we were. And in fact, we delayed the launch of our courses for a good year or so, because we felt that the community might all of a sudden feel that we are now trying to sell products. So we deliberately
00:09:52
Speaker
delayed, or we were not sure whether we should be launching those or not. But at the same time, the most common question we used to get at that time was: we really love the blogs you are writing and the way you are breaking down complex concepts.
00:10:08
Speaker
What do we do next from here? And then how can I take this learning further? So finally, in 2018, we said that looks like our users only want this and they're not getting it elsewhere. So let's launch courses on the platform.
00:10:23
Speaker
But what was your, before you launched courses, what were you making revenue-wise? Was it breaking even for you, or did you have to raise funds to meet expenses? And did you have a team? Tell me a little bit more about that 2014 to 2018 journey.
00:10:39
Speaker
So we always operated the business as a bootstrapped business. I did raise a small friends-and-family round when I was starting. And the idea was very simple, that it gives you a longer runway. You are anyway dipping into your savings, so you might as well lengthen the runway.
00:10:59
Speaker
So I raised a small round in 2015 through my friends and family. But we were a very lean team. In fact, for this 2017 conference which we did, we were 13 people at that time. None of us had any experience of running a professional event. And these 13 people would have all been content creators, like writing articles, blogs.
00:11:26
Speaker
mostly content creators, a couple of developers, and we had just hired first product manager and one salesperson, so very lean team.
00:11:40
Speaker
A salesperson for getting sponsorships of events or selling ad space? And hackathons.

Role of Hackathons in Engagement

00:11:47
Speaker
So hackathons, we had started in 2015. So we would essentially have these, let's say, annual kind of arrangements with companies. So at that time, a lot of tech companies used to be our clients, so Great Learning or some of these companies.
00:12:04
Speaker
would advertise on our portal. So the salesperson would essentially look at these relationships and see how we could grow with these. And hackathons, were they like a source of hiring? Like companies would hire through hackathons and you would charge them something to help them hire.
00:12:23
Speaker
So, it started as an engagement tool, but very quickly we saw that it could be used for hiring, it could be used for crowdsourcing solutions. So, if a company is facing a problem, they could put it in, and the community would work on it.
00:12:41
Speaker
So that's what we were doing. So for a company to come, there were essentially three reasons. They could look at hiring, they could look at crowdsourcing, or they could even do it for branding, that we are doing some interesting, cool work. And we have worked with almost every
00:13:01
Speaker
big company out there in analytics and data science to conduct these hackathons. And I still believe that for the community, they're a phenomenal tool, right? It just focuses all the energy on problem solving. And typically on one weekend, right? So we would have these weekend hackathons. The community was... And this is an offline thing, like the hackathon.
00:13:28
Speaker
No, it was online, and there were hybrid ones as well. But typically, over a weekend when there is a hackathon happening, the community would exchange between 10,000 to 15,000 messages about the problems and the learnings

User-Generated Content and Quality Control

00:13:44
Speaker
or what they're trying. So in fact, those weekends used to be crazy, right? So you would start the hackathon at 12 at night.
00:13:52
Speaker
And then you'll see a surge of people coming and downloading the data set, looking at the problem statement. And then, you know, till like three o'clock in the night, people would still ask questions about the data sets. And we would reply to them on Slack or discussion portals. And then people would start building their models and see the leaderboard.
00:14:17
Speaker
What is the problem they are solving in a hackathon? You would give them a dummy dataset and ask them to draw some
00:14:26
Speaker
The idea was to have these real life problems as close to real life as possible. So at no point have we created dummy data for hackathons. So we would work with companies to bring their problems out. At times the companies would want masking of data. So we would do that masking in their own campuses, et cetera. But yeah, so personally identifiable information would get masked.
00:14:54
Speaker
Yeah, you would get all of that out and you would leave any sensitive information out, but still retain the flavor of a data problem. And then the intent was to create something which the companies can then take later on and solve their problems. So we were working closely with companies, bringing these problems and releasing them in form of closed problem statements. So it was not
00:15:20
Speaker
open problem statements like we usually have in a lot of hackathons, but it was a closed problem statement. And yeah, that worked brilliantly for the community members to come and work on a common problem. And what was, like, give me an example, what kind of problem? So for example, this was for, let's say, an insurance aggregator, right? So we were looking at what are the
00:15:46
Speaker
propensities of these leads, whether they would buy an insurance product from these companies. So we had lead data from these customers, their behavioral data. So for example, how many times have they visited the product pages, et cetera, and you had to predict whether they would buy insurance in a particular time frame.
00:16:09
Speaker
There was another problem from a retail giant. They wanted to predict sales of specific items during the holiday season. So that was another problem. So again, most of these problems were very close to what people were working in their real life. The only difference is we would do the hard work of collecting the data, making sure that the data is at least at a place where it could be released in form of a hackathon.
00:16:39
Speaker
and then open it up to the community. So that was the anatomy of a problem statement. So yeah, these 13 people were looking after hackathons,
00:16:51
Speaker
And you said you had a product manager and a developer or two. What was it that you were building? This was like a WordPress site that you were running? So the blog was on WordPress, but for example, this entire hackathon platform was built in-house, right? So people could upload their models. It would evaluate them in real time and show them their rank on the leaderboard.
00:17:14
Speaker
People could have discussions. So that was all built in-house, and we were still playing around with product ideas, right? So for example, we were looking at integrating a coding window into our platform, or we were also looking at building a mobile app, or a
00:17:31
Speaker
more evolved version of the discussion portal. So there were these ideas. So we always wanted to be more product-focused as opposed to a media business, but it would take time before we would get there. What is the coding language that is used by data science professionals?
00:17:53
Speaker
Today it's largely Python, but back in 2014, right? So that in itself has gone through its journey, right? So in 2014, SAS, which is proprietary software, used to be the most commonly used language at that time.
00:18:09
Speaker
Then, you know, people started exploring open source, and for a couple of years R looked like the main language which people would use. And I think late 2015 onwards, Python started capturing a lot more share because it's production ready. The ecosystem is much bigger.
00:18:31
Speaker
And today, I think it's hands down Python.

Python's Dominance in Data Science

00:18:35
Speaker
So almost 50, 60% of the data scientists would by default use Python in their day-to-day tasks. And Python is an open source database tool?
00:18:47
Speaker
Python is an open source programming language. And it has a whole set of libraries which are open source. So I want to run a statistical analysis. I can just add that library and run that code. I want to build a machine learning model. There is a library called scikit-learn. I can import that, build my model, and be good with it. Compared to, for example, with SAS, you would
00:19:12
Speaker
pay the license fee and then server fee and all those things. So that democratization happened during that period. So in fact, when I joined Aviva, the kind of costs we incurred in setting up the team,
00:19:30
Speaker
they almost went to zero by 2014. So the only cost you would pay in setting up a data science team was essentially the hardware and the payroll. Those were the only two costs left. So that democratization happened during that period. And I think a large part of our growth was also fueled by that, that people wanted to learn more than people wanted to
00:19:55
Speaker
know more, and that's where the community came in very handy, as a place where people could spend time without any restrictions, and that's what fuelled our growth at that time. And then, as you said, at that time the business model was largely media driven, and we were close to breaking even at all points.
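The open-source workflow Kunal describes, import a library like scikit-learn, build a model, and be done, can be sketched in a few lines of Python. The data here is synthetic and purely illustrative, not from any of the problems discussed:

```python
# Minimal sketch of the "import a library, build a model" workflow.
# X and y are synthetic stand-ins for e.g. behavioural features and
# a bought / did-not-buy label.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                          # illustrative features
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)   # illustrative label

# Fit a model and check how well it separates the classes.
model = LogisticRegression().fit(X, y)
print(f"training accuracy: {model.score(X, y):.2f}")
```

No license fee, no server fee: the library is imported, the model is built, and you are good to go, which is exactly the democratization being described.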
00:20:19
Speaker
But the intent was always to take that and reinvest it into the business. So I think in the first two years, we did not make a profit, although we always had sight of profitability. The intent was to grow the community as much as possible. That's what we were doing. And what revenue were you doing by 2018?
00:20:39
Speaker
By 2018, we would be roughly half a million dollars. So 2017-18 would be roughly half a million dollars at that time. About 3-4 crores. And what was the split here? Like how much from advertisement, how much from event fees and
00:21:04
Speaker
Largely advertisement driven, and the event we did in 2017 was about 30% of the revenue. So it quickly was very evident that this could be a very profitable business on its own.
00:21:20
Speaker
In 2018, when we launched courses, in the first year we had three almost equal revenue lines. So courses gave us B2C monetization, hackathons and hiring was also a clean line, and then the community events. So we ran for a few years in that model.
00:21:41
Speaker
These courses that you launched, like how intensive were they?

Focus on Practical Machine Learning Skills

00:21:46
Speaker
And tell me about the journey of the courses. Right. So courses were largely self-paced, but very application-oriented, with industry-relevant examples, right?
00:22:00
Speaker
So this was the time when, for example, Coursera was probably the first place where people would learn. And the biggest problem with Coursera was that it came across as very academic. So while you would learn the theory behind machine learning,
00:22:17
Speaker
you would not be able to take that and apply it to solve problems. And we were immediately seeing a lot of people struggling with even building basic models in hackathons. So people would come by doing these courses. And the minute they'll see an unstructured problem out there, they'll struggle. So the thought was that, can we create these courses which help people become more industry ready?
00:22:46
Speaker
So just to give an example, when we were teaching machine learning, we didn't say that this is the equation behind these models and this is how you build it. We said, okay, let's take an example that you want to predict house prices in an area.
00:23:02
Speaker
how would you do it? Assuming you have data of past prices. So the first logical step is to say, I'll predict the average. So that's your first model. So just taking that fear out of machine learning and saying that if you can say this average is my first prediction, that is your first model.
00:23:21
Speaker
Now, if you have to make it more specific, what would you do? You could introduce some sort of segmentation. In this case, it could be, for example, geography: where is that house located, and what is the average in that area? And then you could introduce more variables, how many bedrooms are there, et cetera.
00:23:39
Speaker
So all of a sudden, you're not approaching it from theory first. You're taking that application. And then while you're doing this, you'll introduce the challenges, that you want to capture how many bedrooms are there in each apartment, but you don't have this data. So how do you capture that? And then, so our courses were very application-oriented.
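The teaching progression described here, start with the overall average as your first model, then segment by geography, can be sketched like this. The toy prices and city labels are purely illustrative:

```python
# Sketch of "the average is your first model", then segmentation.
import pandas as pd

# Hypothetical past prices with a geography column.
houses = pd.DataFrame({
    "city":  ["A", "A", "B", "B", "B"],
    "price": [100, 120, 300, 280, 320],
})

# Model 1: predict the overall average for every house.
overall = houses["price"].mean()

# Model 2: segment by geography and predict the city-level average.
by_city = houses.groupby("city")["price"].transform("mean")

print(overall)            # 224.0 for this toy data
print(by_city.tolist())   # city A houses get A's average, city B get B's
```

Each added variable (bedrooms, area, and so on) refines the segments further, which is exactly the application-first route into regression the courses took.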
00:24:04
Speaker
So how did the application part of it happen in a self-paced manner? Like, you had some way for people to upload the work that they do on Python, which would get evaluated? Using the hackathon platform which we had built, right? So in between the courses, we would say that, okay, here is a problem which we'll now work on.
00:24:23
Speaker
And you can go to this URL, upload your solution and see how good or bad you were. So that's where, in fact, a lot of experience of running these hackathons, the kind of problems people face when they come to these hackathons for the first time gave a lot of insight in terms of how we should go about building our real courses.
00:24:45
Speaker
Amazing. And what is the file? So like Python generates a file. I'm not from a coding background, so I don't...
00:24:55
Speaker
You can upload a Python script, or you could upload a model's predictions as well. So you could say that here are my predictions for each of these cases, go and validate how good or bad they are. So both the options were there. The preference obviously was for people to upload their Python script, because then you can truly validate them.
00:25:17
Speaker
So that Python script would be run on the data, because you were the one providing the data. So you would run that script on the data, and then validate it against some ideal response, and through software flag what the gaps are.
00:25:33
Speaker
Exactly. And the answer data itself was also broken into two parts. So one was visible publicly, right? So when people upload their solution, they need some indication of how good their model is. So that was on the public part of the data. And then there was a part which was held back from the public, which was the private data.
00:25:57
Speaker
And the actual rankings finally was on the private data. So you're not allowed to see the data on which your final model will be evaluated. And that's how this entire solution was constructed. OK, so like, for example, it would be a data set with 10,000 rows. You would only share 1,000 rows with them, but you would run the script
00:26:19
Speaker
on all 10,000 rows. Exactly. You would only see the model working on a part of the data. And again, depending on the data, the actual split could vary. So for example, if it is time-based data, then what you would say is, I'll open the data for 2018, and I'll evaluate you on the outcome I saw in 2019.
00:26:43
Speaker
So use all the insights you need from 2018, but 2019, which you are not seeing, is what would be used to evaluate it. So essentially, make it as close to real life as possible, because these are exactly the challenges which these data professionals face on a day-to-day basis.
00:27:02
Speaker
and make it exciting, make it essentially a place where the community would love to spend their weekend doing this learning. And what were these courses priced at, typically how long did people take to finish a course, and what all courses did you launch?
00:27:22
Speaker
So the first course which we launched, that was launched at about 4,500 rupees price point. It would take anywhere between 3 to 6 months as long as you are spending, let's say 4-5 hours a week. And so that was the first course which we launched, which was essentially Introduction to Data Science.
00:27:43
Speaker
So, building all of those from the ground up. And from there on, we started adding more and more courses. So there were courses on visualization, there were courses on advanced machine learning, there were courses on natural language processing, computer vision, and a lot of these advanced topics.
00:28:03
Speaker
So in about two years, we went from one course to almost 30-odd courses. At what average price? Average price, so with time, we continued to bring the prices up. And what we also started doing was we started bundling these courses, because from a user perspective, they would want to go in a logical flow.
00:28:31
Speaker
So the average price point, if you buy a specific course, would still be around 5,000 rupees, but these bundles would cost you anywhere between 15,000 to 20,000 rupees at that time. And yeah, this was the time when we were running the business in a very bootstrapped manner. So for example, we would launch the courses first,
00:28:53
Speaker
collect the revenue, and then go and build them. So we would pre-launch the course. A lot of those hacks, you know. And it was a brilliant way to see how much demand is out there.
00:29:09
Speaker
And you could accordingly either accelerate it or say that we are missing something, how do we iterate from there. And then in 2020, we launched a program called Black Belt, which is a subscription to all Analytics Vidhya courses, along with one-on-one mentoring.

The Black Belt Subscription Program

00:29:31
Speaker
So the day you take this program, you get subscription to all the courses.
00:29:36
Speaker
there is a mentor who would get on a call with you, explain which course to take when, customize it to your needs, and that mentor would be in touch with you throughout this journey. So we then moved away from selling these individual courses, and we said this is the only product which we are selling.
00:29:58
Speaker
And what is the price of the Black Belt course? Black Belt today costs about 65,000 rupees or 1,000 dollars, depending on which currency you're paying in. But when we launched it, at that time it was about 40,000, and then we added more courses, content, and services on top of it.
00:30:18
Speaker
And around the same time, so 2020-2021, we launched another program called Bootcamp, which was essentially aimed at people in the zero-to-five-years-of-experience bracket. And the intent was that each person who comes into this program should be able to get a job in industry. So it was a job-guaranteed program and a selection-based program. And in fact, the first batch we
00:30:46
Speaker
Was it self-learning or was this, like, cohort-based? This was cohort-based, instructor-led. And in fact, the first batch was actually aimed to be a physical program. And it was supposed to go live on 1st April 2020.
00:31:11
Speaker
So then we moved it entirely online. We launched the program in July. So that's the second kind of big program which we have where we select the students who come into the program and then every student who comes in, it's our responsibility to make sure that they get placed in the industry by the time they finish their program and then go through the coursework.
00:31:36
Speaker
I have a bunch of questions I want to ask you. So these modules, initially you were selling these modules which were, like, focused on one topic each. Did you see that the completion rates were low, et cetera, because of which you evolved it and stopped selling those? Like, what was the reason to stop selling modules?
00:31:57
Speaker
So it was two things. So modules individually, so actually to a large degree, it was driven by the fact that we were bootstrapped. So at any point, you only have limited bandwidth to create these courses. So we continued to add, but once a course is created, you can keep selling it.
00:32:19
Speaker
So the bundles there, we are still using the same content, but then essentially it's a call of how much ownership we want to take in the journey of the learner. And we said that instead of looking at those as point in time, let's own the entire journey, give as much guidance as possible.
00:32:38
Speaker
So that was one reason. And completion rates, so in general, any self-paced course would have far lower completion rates. But our completion rates were still, from an industry benchmarking perspective, very high. So almost 15% to 20% of people who were taking these courses were completing these courses, which is a lot higher than what anyone else had in industry at that time.
00:33:07
Speaker
So that was not the reason. The reason was that ultimately we want to create an impact on the life of the learner. And the best way to do it is to say that for the next two years, we'll work with you on a one-on-one basis, in as close a manner as possible, and create an impact. So that's why we moved away from selling these individual courses.
00:33:30
Speaker
Also, I think the other factor was that ultimately the effort required to sell a module versus an entire program was, I mean, the differential is not very high. And then the learners also wanted more. So it was just a good win-win solution for everyone involved. That's why we said that we would just focus on programs.
00:33:55
Speaker
So today, nobody talks of self-paced courses. I guess that is fundamentally because these courses put you in a race to the bottom. If you can create them, then other people can also create them. And eventually, the differentiation starts going down. And a completion rate of 15% is not ideal; you are selling something which only 15% of your users fully benefit from.
00:34:22
Speaker
So I guess that's why the industry as a whole has moved away from these self-paced courses into more cohort-based courses.
00:34:30
Speaker
So it's a fairly fast evolving industry. So for example, Coursera started this in a lot of ways, the self-paced revolution in that sense. And then people said, OK, this is good, but the completion rates are low. So then people started adding more things. So mentorship, for example, is a great value add.
00:34:53
Speaker
or peer learning could be another such lever. So people continue to do that. But I think this question of ultimately how much is the impact on the life of the learner is what is driving this industry. So there, I think the learning you go through in a cohort-based program is fundamentally a lot better.
00:35:15
Speaker
However, I think the right balance sits somewhere in between, because specifically, for example, in our case, there are a whole lot of learners who can't come at a fixed time on an ongoing basis. So they can't say that every Wednesday and every Friday, I'll take this course from 7 PM to 9 PM,
00:35:36
Speaker
or for the next six months, I'll devote every weekend to learning this. And that's where self-paced courses are still brilliant ways to address that need. So my sense is that you would ultimately have a mix of the two modes, which would be the best way going forward.
00:35:59
Speaker
And for a lot of corporate clients, self-paced still makes a lot of sense because they can't say that the entire team would go and attend this training program for next five days or seven days from a business perspective. So then I think what is clear is individually self-paced can't make that much of impact.
00:36:22
Speaker
But at the same time, not everyone can commit to a complete cohort-driven program. So the sweet spot is probably somewhere in between, matching the user need to the right product. So a student coming out of college is best suited for an instructor-led program. A working professional with varying demands of work is probably more suited to a flexible learning curriculum.
00:36:48
Speaker
So your flexible curriculum is Black Belt, where there is an element of an instructor in the form of the mentor and yet it is self-paced. And then the mentor works very closely with the learner, saying, okay, for the next 15 days, how much time are you, let's say, committing?
00:37:07
Speaker
So let's put these milestones in place: in the next 15 days, you'll be able to do this. So it's self-paced, but the learner works very closely with these mentors, because again, if you leave it completely self-paced, you'll not see the impact. So the mentor has to follow up. So for example, let's say a mentor agreed on this 15-day milestone. On the 15th day, the mentor would shoot a mail saying,
00:37:36
Speaker
Hi, how are you doing? Do you want to connect back? So there is again that nudge or trigger at the back end. So you're bringing in accountability through the mentors. Are these mentors on your payroll? What kind of people are they?
00:37:56
Speaker
It's again a mix. So initially, all the mentors were on payroll, but over time... And these are, like, data science professionals? Data science professionals, correct. And then over time, we have expanded it to our community members, right? So the brilliant thing with Analytics Vidhya, because it's ultimately a community platform, is that
00:38:18
Speaker
you could tap into a lot of these experts and they are actually very excited to come back and contribute.

Content Creation Through Community Involvement

00:38:26
Speaker
So we can tap into the top data scientists in the country, we can tap into the top thought leaders and then they can come and share their learning. So today
00:38:37
Speaker
I think almost 70-80% of these mentorship calls are taken by community members. And obviously there is onboarding involved in it. So when a community member says they want to do it, they undergo three month hand-holding as a mentor, right? So what are the products? What are the best ways to do it? And they themselves go through some of these shadow calls, but by end of month three, they'll start operating independently as a mentor.
00:39:06
Speaker
Okay, interesting. And they are paid for every student they mentor. Interesting. Okay. So you've created that mentor onboarding also as a module, like that would be a training module in your system for a mentor to get onboarded. Exactly. And not only mentor onboarding. So, you know, during this entire journey, right, we have always been
00:39:30
Speaker
very focused on community. So for example, till 2019, all the content on the site was getting published by Analytics Vidhya's content writers; everyone in-house was writing blogs. Today,
00:39:47
Speaker
100% of the publishing is through community-contributed articles. And the volume has gone up 15 times. Because in-house, we would publish, let's say, four articles a week. Today, we are publishing 150 articles a month.
00:40:06
Speaker
And the community is sharing their knowledge. So you not only tap into a larger pool, you get very diverse opinions, very diverse data sets, very diverse needs. So a professional sitting in Africa, seeing a problem in, let's say, their banking industry, can bring it through an article, which you just can't do in-house. So that's where the power of community comes into play. We are doing the same thing
00:40:35
Speaker
on webinars, so community members can come today and float what we call a DataHour. But you must have some quality control mechanism, because
00:40:45
Speaker
Quality is entirely controlled by Analytics Vidhya. So every article gets moderated. So for example, for these 150 articles, we get about 350 submissions. And with each of these submissions, there is an editorial team which works with them. And for these writers, the feedback which they get is extremely valuable: they get feedback on how to improve their articles.
00:41:09
Speaker
And now we are doing the same thing on the video side, right? So community members can come and share their wish to float a webinar, which we call a DataHour. And then the team works with them, goes through it, and the minute it is at a stage where it looks like the right thing for the community, we float it to the community members.
00:41:31
Speaker
And this becomes a pool to see who is an expert in what areas, and what their skill sets are. And that gives us a way to bring these people onto our programs, or, let's say, if there is a corporate client looking for trainers in a specific domain, this becomes a pool to essentially get those people. Okay, interesting. So FY 2020, 31st March, what revenue did you end at? Like, you were doing 3-4 CR in 2018. How much did you do?

Impact of the Pandemic on Learning Models

00:42:01
Speaker
FY 2020, we had reached about 8 crore in revenue. And how much of this was the edtech revenue? So FY 2020, as I said, was almost three equal verticals, right? So there was the training vertical, 30-35% ads, and 30-35% through events and hiring.
00:42:26
Speaker
Correct, correct. And then post that we have obviously continued to grow and then Covid changed the mix. So overnight your community events went from paid to free because they were all online now, no in-person events. Hiring almost... But events you anyway were doing online only, right?
00:42:52
Speaker
No, even the monetization, the paid events, were essentially offline. So the conferences were offline. And so that went away. Similarly with hiring: there was this period of uncertainty in COVID about how it would pan out. So there were hiring freezes, etc. On the other end, training demand shot up.
00:43:16
Speaker
So overnight, there were a lot of people wanting to spend their lockdowns learning new skills. So we had a huge bump in the traffic on our learning pages. So 2021 in that sense was a complete distribution reshift. And for that year, we were essentially an edtech company in that sense.
00:43:40
Speaker
And then, yeah, now again we are bringing hiring and community events back into the mix. What's your current ARR?
00:43:53
Speaker
So we are at a run rate of about one and a half to two crores a month in that sense. So one part of your revenue is the B2C edtech revenue; let's just go through your revenue streams. So this B2C edtech revenue: is this self-service sales, or do you have, like, a sales team which drives sales here? What is the go-to-market here?
00:44:20
Speaker
Yeah, it is assisted sales. So people can come and obviously buy these products off the shelf, which is what was happening till a few years back. But then we saw a lot of drop-offs happening in the journey. So today it is assisted sales. So the minute you show interest on the platform,
00:44:40
Speaker
a counsellor would reach out to you, explain the product, take away confusion and guide you through the process. So on the B2C side, it's assisted sales for everything.
00:44:52
Speaker
Okay. And you have corporate revenue also, like from corporate training? How does that happen? So that happens in a very customized way. So with corporates, the intent is to create these extremely customized programs for their needs. So for example, one of the clients we are working with, they take about 50 graduates every year.
00:45:18
Speaker
And for those graduates, for the first six months, we train them on these technical skills, projects, et cetera. Month six to month nine, there is a hand-holding between the client and Analytics Vidhya, where these people are working on the client's projects
00:45:37
Speaker
and we are assisting them, and month nine onwards they are in their roles at the client's end. So it's a highly customized program: understanding what the need is and then delivering in whatever is the right way. So for example, in this case, because people go through the curriculum at the same time, it's cohort-driven.
00:45:58
Speaker
But on the other end, there are clients where it cannot be cohort-driven. So there is a mix of self-paced, live doubt-solving, and then hackathons and workshops built into the program. Do you work with trainers from your community, or do you have on-roll trainers for the corporates? OK. OK.
00:46:16
Speaker
So usually most of the delivery which happens during the weekdays is done by in-house instructors, because people would have their own commitments, but it's a mix. There are community members who are actually doing this as a full-time thing for us, but they're acting as freelancers. So it's a mix. And then for each of these clients, there is this specific program we have created, which we deliver to and for them.
00:46:46
Speaker
And this would be on a relationship basis? Like, you would have a salesperson who goes out building relationships, finding out what they need?
00:46:55
Speaker
So interestingly, this is one of the places where people don't realize the power of community. So most of the B2B business which we had built till last year was actually inbound interest from community members. So a person looking at their clients and having experienced Analytics Vidhya comes back and says, here is a program we want to launch in the company;
00:47:23
Speaker
can we partner, and can you guys run this end to end? And what that enables us to do, even for these clients, is to tap in and bring in expertise. For example, we can bring the top data scientists, who would come and do these information sessions for the clients.
00:47:43
Speaker
So that leverage is just immense for creating these customized programs. So we haven't been a very B2B-sales-driven company in that sense. So you don't do too much outbound; most of your sales is inbound?
00:48:01
Speaker
Yeah, there is a one-person B2B sales team we have today, and most of his time would be spent on inbound. So while the play is relationship-based, the discovery, etc. is all inbound.

Community-Driven B2B Sales

00:48:21
Speaker
And so yeah, you know, with community, that's the big thing that you don't realize that
00:48:28
Speaker
And even for your B2C courses, it would be community? I mean, your top of the funnel would be your community, or do you also spend on performance marketing? We have largely built the business through community. We continue to experiment with performance marketing, but what we have seen is that it's nowhere near as scalable and as efficient or profit-making compared to what we can do from community.
00:48:54
Speaker
So the intent is to continue doubling down on community as opposed to looking at performance marketing engines. Amazing. Okay. So one is corporate training revenue, B2C edtech revenue, then hiring revenue. How do you do that? Do you charge, like, per hire, or what is the model there?
00:49:17
Speaker
What we do is we talk to the companies who are hiring, and we conduct these hackathons and competitions, which we call jobathons. In a typical jobathon, about 20 companies would participate. People get evaluated on their skills, and they need to solve a problem as well.
00:49:38
Speaker
And based on that, they come onto a leaderboard. And these profiles are then shared with the companies who are hiring, based on a match. So you built an algorithm at the back which matches that. And finally, when they hire is when they pay us. So it's success-based, because again, the intent is to bring the best opportunities to the community, and any friction you introduce in that process ends up being a disservice to both sides.
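As a purely illustrative sketch of this kind of skill-based matching — the names, fields, and scoring below are hypothetical, not Analytics Vidhya's actual algorithm — ranking leaderboard profiles against a company's needs could look like:

```python
# Illustrative only: a toy jobathon-style matcher. Candidate, Company and
# match_candidates are invented names, not Analytics Vidhya's real system.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    skills: set               # skills demonstrated in the jobathon
    leaderboard_score: float  # performance on the jobathon problem

@dataclass
class Company:
    name: str
    required_skills: set

def match_candidates(candidates, company, top_n=3):
    """Rank candidates by skill overlap first, leaderboard score second."""
    def key(c):
        overlap = len(c.skills & company.required_skills)
        return (overlap, c.leaderboard_score)
    ranked = sorted(candidates, key=key, reverse=True)
    return [c.name for c in ranked[:top_n]]
```

Sorting on the `(overlap, score)` tuple means skill fit dominates and raw contest performance only breaks ties, which matches the idea of sharing profiles "based on a match" rather than on score alone.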
00:50:04
Speaker
And so you charge like a hiring agency, like one month's salary, with the regular terms like replacement clauses and all of those things that hiring agencies typically have?
00:50:18
Speaker
Right. So it's not exactly similar to that, but the intent is the same. So for example, replacement clauses, et cetera, are not there. But at the same time, for example, if a person didn't join, then yes, we would obviously adjust. So in that sense, it's not a hiring agency. So we actually tried a few different models before we zeroed in on this. But there was... I mean, the industry works in a particular way.
00:50:48
Speaker
Yeah, this is easy to get a buy-in. Otherwise, it would be very hard to get a buy-in for anything else. Exactly. So we said that let's remove any friction in the way for HR to hire. OK, interesting. So what about ad revenue? Do edtech still advertise? Because now you're in edtech yourself. So there is that.
00:51:15
Speaker
So yeah, ads we discontinued two years back. So we stopped running ads on the site. So now instead of selling other people's products, you're selling your own products, which is much higher earning.
00:51:32
Speaker
Well, that's why we have moved away from the media model. Today, it's essentially a community portal where you are trying to add as much value as possible and address a lot of these needs of community members through your own products and through your own network, right? And, I mean, directionally, in future, the way we see it is that all of these people who are contributing back
00:51:59
Speaker
would become creators on Analytics Vidhya. So we would want to enable people. Let's say there is a data scientist working at a large bank who wants to offer a course on credit risk modeling.
00:52:14
Speaker
By all means, we should be able to enable that. So the direction is that we would want to have close control on the content, but bring out that expertise which sits in the community and build on that.
00:52:31
Speaker
Okay. Okay. Amazing. Amazing. So these are essentially your revenue sources now: hiring, corporate training and edtech, like the B2C edtech. Okay. You also raised your first institutional round. I think you raised it last year, right? Tell me about that.
00:52:51
Speaker
So, we raised this round from Fractal and Fractal is again a very domain focused company. So, in that sense, it is exactly the domain we want to be in.
00:53:05
Speaker
And it's a company where people understand community. So just for people who don't know what is Fractal. So Fractal is India's largest analytics and AI services company. Works extremely focused on Fortune 100 clients.
00:53:25
Speaker
high-value engagements, very tightly knit engagements. And the aim is to solve problems for these Fortune 100 companies. So they are about a 4,000-people company today, and again, growing very fast. So Fractal would be doing what, say, McKinsey, Bain, BCG do, except they would be doing it through the lens of data.
00:53:47
Speaker
Exactly. And so data plus behavioral design is the area where they focus. Behavioral design? What is this? I've never heard of this. So the way consumers behave is very different from what you would think logically. So there are these inherent biases which come in. So the unique proposition for Fractal is that, obviously, they bring in the data expertise, but they also understand these
00:54:15
Speaker
biases and behavioral studies which go into creating these products or services. So, the UI/UX design, or how to go about solving those. So, it's a mix of these dimensions which Fractal brings to the table.
00:54:32
Speaker
And so for us, it makes tremendous sense to have, you know, someone like Fractal backing us. So they understand the domain, they understand the community business, right? So, why do they understand the community business? They are a services business, right?
00:54:51
Speaker
They are a services business, but the leadership team there understands the power of community. In fact, in 2014, when I was writing the articles, I reached out to Srikanth Velamakanni, CEO of Fractal at that time, for an interview.
00:55:07
Speaker
And he was the first business leader to agree to doing the interview on the portal. So since then, I've always used him as a mentor. At any point, if I face a conflict, for example, the media model versus community model, he was the person I was brainstorming with.
00:55:31
Speaker
So in that sense, the community aspect, Fractal just gets it. And in fact, since the time we have raised the strategic investment, our focus on community has actually increased a lot more, because everyone understands the power which it brings in.
00:55:50
Speaker
And that's why, while it's obviously an institutional investor, it is someone who understands the domain, understands the community, and brings in a lot of domain expertise, right? So the problems which these Fortune 100 companies are facing, that experience of solving those problems,

Investment from Fractal and Industry Standards

00:56:11
Speaker
Where is the industry heading? What are the kind of challenges people would expect in coming years? All of those can come through their expertise essentially.
00:56:21
Speaker
Okay, fascinating. They would probably also see this as a way to increase the pool of data professionals and data talent in India, which also ties into their business growth. Exactly. From a practice perspective, there is this talent problem which is there. Talent, and employer branding to some extent. But
00:56:43
Speaker
more importantly, they are actually committed to creating a standard for talent, right? So if you think about it, there is no equivalent of the CFA in analytics and data science today, right? So Fractal is committed to creating that standard, like a globally recognized qualification.
00:57:10
Speaker
So that's the second area. And then what we are also doing with Fractal is taking some of these offerings to Fractal's clients. So the B2B expertise and programs which we create can run for Fractal's clients as well. So there are tremendous synergies which can be unlocked. And that's what we are doing through this.
00:57:35
Speaker
Amazing. Amazing. Okay. So on the product front, you built, like, one product which was the WordPress blog. Then there was a hackathon product, which allowed you to share a problem and for people to upload solutions. And then that hackathon product led to the modules product. What else has happened in terms of product evolution? So the LMS is another part which has been built. An LMS is a learning management system; it's basically a way to manage your learner's journey.
00:58:05
Speaker
That must have cost you quite a bit to build. Exactly, right. So,
00:58:13
Speaker
across all of this, we capture all the information about the user, so their skills, for example. So we have this evaluation engine and a certification platform, which has security features enabled. So for example, in the jobathons, when a person is participating, there are security features enabled on the platform so that while you are taking the test, you cannot copy-paste anything from the window.
00:58:39
Speaker
There is monitoring happening and proctoring happening. So there is this entire evaluation platform, and a 360-degree view of these learners. So people are working and engaging on Analytics Vidhya across all these parts: they are reading articles, they are attending webinars, they are participating in these competitions, they are probably undergoing a few courses. So we are building this 360-degree view of these learners in the back end.
00:59:09
Speaker
And what we are today building is how do we enable this creator economy onto the platform where people can come and while they are sharing their knowledge, they can also start getting this back from the community. And so it's in that sense, building this entirely new ecosystem where people are contributing back, but in the process, everyone is benefiting.
00:59:37
Speaker
So essentially these UGC articles, like you said, you publish 150 articles a month, which are UGC, user-generated content. So the users who submit these articles, what are they getting back?
00:59:50
Speaker
So there are a couple of things, right? So the biggest thing is that they are building their profile and expertise; they are positioning themselves. So that's one. And then there is a monetary angle to it. So the way we have defined it is, there is a base reward. So every time your article gets published, you get a base reward. And then depending on how well the article does, you can make more rewards on it. So in that sense,
01:00:20
Speaker
this is a cash reward. So again, the idea is, you know, you have to enable this creator economy. And for that to happen, you have to do it at scale, right? So initially, for example, people were contributing purely because of social proof, and because they had learned from Analytics Vidhya. So they were
01:00:41
Speaker
coming back and contributing. But then you can't scale it as much. And that's where this entire engine for creating these articles came in. So again, just to share some stats, today if you
01:00:57
Speaker
publish a technical article on Analytics Vidhya versus publishing it on Medium or LinkedIn, Analytics Vidhya would give you much higher visibility than what LinkedIn or Medium does. So that's the key differentiator. So you're in the domain,
01:01:19
Speaker
the viewers are very relevant, you're getting positioned in front of people. So the visibility which Analytics Vidhya is providing is a lot higher than what you would get if you published this elsewhere. And the idea is that if we could replicate the same thing for videos, so that people creating these technical videos get higher visibility than what they would get on YouTube, then you essentially crack the problem.
01:01:48
Speaker
So you have a video platform also? I thought your video platform was webinars. You also have UGC videos? So we are building that today. So that's where community members can come and contribute back, and everyone in the process is benefiting. And so on these platforms, based on how many views it gets, people will get rewarded?
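As an aside, the payout model described earlier — a base reward for getting published plus a performance-linked component — could be sketched roughly like this. All amounts and rates below are invented for illustration; Analytics Vidhya's actual numbers aren't public:

```python
def creator_payout(base_reward, views, engagements,
                   per_1k_views=50.0, per_engagement=2.0):
    """Base reward for getting published, plus a performance component.

    The base_reward is fixed per published article; the rest scales with
    how well the article does. All rates here are hypothetical.
    """
    return (base_reward
            + (views / 1000) * per_1k_views
            + engagements * per_engagement)
```

So a piece that never gets read still earns its base reward, and everything beyond that tracks the article's actual performance.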
01:02:16
Speaker
Exactly. It could be views, it could be, you know, engagement metrics, for example, how many people have engaged with the articles you've written. Okay, so Analytics Vidhya is like a full-fledged social network. Like, you can watch videos, you can read articles, you can post something. Can you have user-generated posts also, which don't need an approval, or can you start a thread on something?
01:02:47
Speaker
So there is approval for anything you publish. For comments or questions, they get published directly, and there is obviously an algorithm at the back, so whenever it sees anything fishy, it puts it in the moderation queue. But by and large, as long as you're doing technical discussions, you should be able to ask those questions and get answers without any moderation. But if you have to publish something, so if you have to publish an article or if you have to
01:03:17
Speaker
publish a webinar, then there is a moderation queue. Why not go full-fledged UGC? Just not have an approval mechanism; just allow anyone who wants to publish to start publishing. Anyone who wants to upload a video can just upload it, like any social network.
01:03:37
Speaker
Yeah, so again, brilliant question. We have asked this multiple times ourselves. The thing is, in Analytics Vidhya's growth, one of the key reasons why people come back is the quality of content. They see Analytics Vidhya has this ability to take complex topics and break them down into small, easy-to-digest pieces, right?
01:04:03
Speaker
So we want to retain that in the content which we publish. So for each of these articles, that's where a lot of work happens with these contributors. So if someone, let's say, just publishes a theoretical construct in the form of an article, we'll say that we'll not be able to publish it.
01:04:22
Speaker
Here are the changes we would want you to make: how can a learner benefit from it? What are the three things they can take away from it? I understand you're saying you want to control the quality of articles, but this can be solved in another way. Twitter has a blue tick, so you could have a blue tick on articles which are
01:04:43
Speaker
Analytics Vidhya-approved articles, and in general have anyone publish, and the data will tell you which article is a quality article. You can just see which article is getting more engagement, which means it is a quality article, which means it gets shown to more people. You could solve it like that.
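Purely as an illustration of the engagement-driven ranking the interviewer suggests here — the smoothing constants and field names are invented, not Analytics Vidhya's system — a toy version might look like:

```python
def quality_score(views, engagements, prior_views=20, prior_rate=0.05):
    """Smoothed engagement rate.

    The prior terms keep a brand-new article with 3 views from being
    judged on a tiny sample; both constants are invented for illustration.
    """
    return (engagements + prior_views * prior_rate) / (views + prior_views)

def rank_articles(articles):
    """Order articles so the most-engaging ones surface first."""
    return sorted(articles,
                  key=lambda a: quality_score(a["views"], a["engagements"]),
                  reverse=True)
```

An article with a 30% engagement rate on 100 views would then outrank one with a 10% rate on 1,000 views, which is the "let the data decide" mechanism being proposed.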
01:05:00
Speaker
Correct. It could be solved like that. And then, I mean, there are different ways, right? So this is one. The other one could be that you say there are open community articles. Another one is a featured list: the best articles, you put on the featured list. For now we said, let's make sure that we are doing it in a slightly more controlled way. We might open it up as we grow; it's something which we continuously evaluate. And the intent is that as long as anything which is getting published
01:05:30
Speaker
is of value, we would want to enable it. We just don't want people to come in and republish a lot of content which is plagiarized. So plagiarism is another problem which gets filtered out in this process. So that's the key reason. So we definitely want to open it up, but in a controlled manner; that is what we're doing here.
01:05:55
Speaker
Got it. Okay. You spoke about CFA for analytics. Is that something which is on your roadmap to create that kind of a globally accepted qualification?
01:06:07
Speaker
Correct. And we are working with Fractal closely on this, to build this. That's the intent: can we have a globally recognized certification which essentially becomes a proof in itself, a gold standard for talent. And how would that work? Would it just be a series of exams, or would it be a course that you have to take, or how would it work?
01:06:31
Speaker
Right, so a lot of that is in the works right now. For now, it would be a curriculum or a standard defined, and a program which people can opt for. So people can go through the journey, but people can also independently just sign up for the exam and say, I want to undergo this certification.
01:06:54
Speaker
But what we are clear about is that it has to be offered together. So instead of just saying here is the certification, there is a way to get there for people. That's the way we are approaching it. Okay. Okay. So this year you're likely to end at a 20 CR type of top line. Will you be profitable with that number, or what is your path to profitability looking like?
01:07:22
Speaker
So fundamentally the business which we have, it's fundamentally a profitable business. Last three years we were profitable. Because your customer acquisition cost is very low. You don't have to burn money to get customers. This year we are looking at running some of these experiments. We are building these platforms.
01:07:42
Speaker
So this year we would not aim for profitability in that sense. Having said that, none of the fundamentals change. So at any point we can steer the business towards profitability. What is the way to make this 20 CR as 100 CR or 500 CR? What do you see are those things that you need to do? Do you need to go global? Do you need to have more expensive courses also? Talk to me about that.
01:08:10
Speaker
So, I think the biggest thing is to continue to become the single platform for any knowledge need which a data professional would have. So, the aim is that every data professional should be there on the portal doing their learning, through open learning or otherwise. So that's the biggest ask, and my
01:08:35
Speaker
fundamental belief is that if that happens, we will be able to monetize and then grow the platform like we've been doing till now. So that's the key focus: keep investing back in the community, bring in these high-value products which address specific needs,
01:08:53
Speaker
and continue to build business. A few levers which we are actively investing in. So today, for example, most of the content is around data science. Last few years, we've started building content on data engineering. We have started building courses on Web 3 recently. So the idea is that any need which a data professional might have
01:09:18
Speaker
should get satisfied on the platform. And then as we do it, I'm sure the revenues will take care of themselves in that sense. Okay. Okay. Okay. Got it. Data science as a field. We discussed its evolution briefly. I remember there was a time when the big buzzword was big data.
01:09:36
Speaker
And today it is machine learning. What is the difference between these terms? Like for an outsider, what is the difference between data science, data engineering, machine learning, AI? Because these terms are very often used interchangeably. Help me understand technically what is the difference between them. Sure. So let me explain through very simple analogy, right? So let's say you are working in an automobile company.
01:10:05
Speaker
Now, if you think about how that business is organized, there would be these specific career tracks or there would be these specific profiles. So there would be a scientist sitting in a lab optimizing the engine. So the problem that person would be working on is how can I make this engine more efficient.
01:10:26
Speaker
Can I get 20% more mileage, or can I increase the torque further? So that's the problem which a scientist is working on: a very specific problem. And you don't need an army of scientists; you need a few PhDs or a few people who can really tackle that problem. So that's one profile. And in our analogy, that is the data science equivalent.
01:10:52
Speaker
That is the data scientist sitting in an organization working on a very specific problem. And the work ends when a POC has been created. So the person is not expected to put it in production.
01:11:07
Speaker
On the other hand, there would be these engineers who are working in the plant, in the engineering unit, and what they would do is take these findings which the scientists have made and put them in production. So, how should the assembly line look?
01:11:24
Speaker
Who would do what: all of that is being handled by engineers. So in a similar analogy, data engineers essentially build these pipelines. So how is the data flowing from different systems? Which data would come at what frequency? So for example, Uber needs it in real time, a bank needs it on a daily basis. So every data source would have its own needs.
01:11:49
Speaker
So that's what a data engineer would do. So collate all of these different sources, make that ready for this algorithm, which the data scientist has built, put it in production and maintain that on a day-to-day basis. So those are your data engineers.
01:12:06
Speaker
And then you have a layer of management who are taking decisions. So how many units do we want to produce? And what is the cost of raw material? What is the ROI like? So these are your business analysts in our analogy. So people who are working closely with business,
01:12:26
Speaker
and using data to solve problems like what I was doing at Capital One. So those are the essentially career tracks, a data scientist, data engineer, and a business analyst.
01:12:38
Speaker
Now, there are tools. So machine learning is a tool, right? So who is using it? So obviously data scientist needs to know machine learning inside out. A data engineer needs to understand what machine learning does and put it in production. So they may not know the algorithm which brings in that next 20% uplift, but they should understand what this algorithm does.
01:13:03
Speaker
And for a business analyst today, a lot of time is going in analysis as opposed to machine learning.
01:13:11
Speaker
Machine learning is different from algorithms. What is the difference between them? I mean, machine learning is core AI. Often people would say we are an AI-powered product, but they are actually an algorithm-powered product. And I'm just saying this because I've heard other people talk about it. I personally don't know the difference between these two.
01:13:33
Speaker
Yeah, so machine learning is essentially the set of techniques which enable machines to become smarter. So they are building logic, which is done through these algorithms. So an algorithm which is working on a specific problem today cannot be taken to another problem and be expected to work. So you need people who are tuning or matching these algorithms to the right problems.
01:14:01
Speaker
And what you're essentially trying to build is intelligence, or artificial intelligence. So machine learning is the way to build artificial intelligence, and algorithms are essentially enabling it. Is this understanding correct, that an algorithm is something written based on our understanding of the world, whereas in machine learning
01:14:22
Speaker
you are feeding the machine data and asking it to understand the world. So, algorithms have both these classes. There could be algorithms which you have defined, or there could be algorithms which it has learned from data, and machine learning is essentially learning from data and evolving it. So, for example, today when you do Google searches, there is no one at the back sitting and fine-tuning that algorithm.
01:14:47
Speaker
So it is learning from the searches and evolving the results by itself. That's machine learning. So if I run a search and I choose a result on page two, then the machine knows that the page one results were not great and the page two result was better. And then it will look at similar searches and draw some sort of analysis on what is the most relevant result.
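The feedback loop sketched here can be shown with a toy re-ranker. This is not how Google search actually works, just a minimal illustration of results re-ordering themselves from click signals:

```python
from collections import defaultdict

class ClickRanker:
    """Toy search ranker that learns from which results users click."""

    def __init__(self, results):
        self.results = list(results)
        self.clicks = defaultdict(int)

    def record_click(self, result):
        # A click on a page-two result signals it was more relevant
        # than whatever ranked above it.
        self.clicks[result] += 1

    def ranked(self):
        # Most-clicked results float to the top; ties keep original order
        # because Python's sort is stable.
        return sorted(self.results, key=lambda r: -self.clicks[r])

ranker = ClickRanker(["page-one-result", "page-two-result"])
ranker.record_click("page-two-result")
print(ranker.ranked())  # the page-two result now ranks first
```

A production system would weight many more signals than raw clicks, but the loop of observe, update, re-rank is the same idea.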
01:15:13
Speaker
Were not great. Exactly. And that's machine learning in action. On the other end, assuming you are not at that scale, if you are asking an analyst to see what products our customers are finding useful on the website, that's simple analysis.
01:15:33
Speaker
And you could use that in a rules-driven manner. So for example, today I can say that most of the people coming to Analytics Vidhya want to learn machine learning. So I'll show that in the first fold.
01:15:46
Speaker
Right. So I have defined the algorithm, how the page comes up. Right. That's one way, but I could also say that I don't want to do it. I am just throwing all the content in, and it by itself brings the most relevant things up. And when you build that, again, that's machine learning. Almost every social media platform would have their own recommendation. In fact, that's the make or break thing, right? Because as a user, you would have a few seconds of attention.
01:16:16
Speaker
And if you can grab that, and that's where I think YouTube has done this brilliantly. Once you start watching a video, most people would find it very difficult to step out of it. So on the edtech side, you are essentially competing with Great Learning, upGrad, and these are all multi-billion dollar valuation companies. So what makes you think that you can win?
01:16:44
Speaker
because they are also offering data science courses and machine learning courses and all. It's a lucrative field.
01:16:51
Speaker
Correct. So fundamentally, we don't see ourselves as a pure edtech player. So we are actually not in a war to win against these people in that sense. What we are building is a community platform focused on data professionals. And any need these customers might have is what we need to serve. Today, education is one of the needs. Tomorrow, for example, one of the products which we have built today is a way to
01:17:21
Speaker
build data engineering pipelines in the browser. So we are using it to run competitions where people can write code and build these pipelines in the browser. So in that sense, what we are building is not an edtech company. So in that sense, it's not right to compare against them. Yes, today a lot of revenue comes from education as an offering.
01:17:47
Speaker
Then also, our value proposition is very different, right? We are building these very application-oriented, outcome-driven programs, which help. So we are not in the business of providing college degrees or granting degrees. That's the fundamental difference, right? And that's where, for example, our choice of investor was fundamentally different. For us, success is not
01:18:14
Speaker
the edtech revenue we make. The success is how many people we are able to make a career impact on, and how many data professionals across the globe are on Analytics Vidhya as a platform. So a better comparison would be GitHub then, for you. GitHub or LinkedIn for data scientists, for example, is what we are looking at.

Generative AI and Expanded Offerings

01:18:34
Speaker
Okay. And like, say, GitHub has built this Copilot product, which allows coders to have an AI-supported experience to make them more productive. Do you think you'll build something like that for data professionals, like a Copilot?
01:18:50
Speaker
Yeah, that would fit brilliantly with what we are doing. And that's why I was saying we are not an edtech company in the way we are looking at it. And that's the exact kind of problem you would want to solve. So I'm sitting with Kunal. Again, we recorded what you've heard so far
01:19:09
Speaker
more than six months back. So I just wanted to catch up once with Kunal and get in some updates before we release the episode. Kunal, a lot of action has happened in the space of data science and machine learning, especially with OpenAI completely changing the field. Just share with us some of the recent developments: how you've responded, how the business model has evolved.
01:19:33
Speaker
Sure, Akshay, thanks for circling back on that. So it has been a crazy period in terms of the amount of development and the impact which generative AI has created. I mean, when we were recording, if you had asked me how quickly the pace of AI adoption would change, even in my wildest dreams I wouldn't have predicted what has happened.
01:20:01
Speaker
So OpenAI released the ChatGPT model first and then followed it up with GPT-4. And then I think just two days back, they are now releasing their first multimodal generative AI model, which is essentially combining DALL·E and ChatGPT. So you could now do chat as well as vision through a single model.
01:20:29
Speaker
So a lot of action, obviously. And I think this was probably the first time when a lot of people not in the field saw what generative AI could create. So all of a sudden, it was one of those moments, like when, during the COVID lockdowns, overnight the interest in online learning grew.
01:20:52
Speaker
Similarly, when ChatGPT released, I think within a week everyone wanted to know what is possible, what is not possible, how do I use these tools. So, a lot of action has happened since. And the way we are looking at it is, you know, there are
01:21:10
Speaker
largely two levels in terms of how we are approaching it. So the first one is general users who can use these tools to increase their productivity. So let's say I'm a blogger and I want to get a first draft out, I can use these tools to get that first draft out. I'm a designer, I want some design ideas to come through, I'll just give a prompt to Midjourney and it will throw back some ideas.
01:21:40
Speaker
So it's great at creating these first drafts, doing some brainstorming, and then obviously the human is still in the loop, so you need to optimize. So that's the end-user game. So that's one sort of audience.
01:21:56
Speaker
And then the second audience, which I think is our core audience, are people who are building these technologies. So OpenAI is a proprietary model. You can only call APIs, and then there are a set of restrictions about how you could use them and how you could not use them.
01:22:14
Speaker
Then there is open source, like Llama 2, which was released by Meta, and Microsoft also had a partnership there. That's open source, but it's not performing as well as OpenAI's models. So for any commercial use, Llama 2 is a better model. So there are a lot of people
01:22:35
Speaker
who are still tinkering with, you know, how do you solve specific problems? How do I use these technologies on my own dataset and train, for example, let's say, an internal chatbot? Or can I use this to, you know, get first drafts of the code based on my existing
01:22:55
Speaker
code. So all of those applications which are very developer-focused, and that's the audience. So these are the two kinds of separate audiences, and obviously the first one is a lot bigger, but that's not the market
01:23:15
Speaker
we focus on so much. The second one is where we focus. So what we have done is essentially multiple things. First of all, we again truly believe that this needs to be evangelized. So we have started doing a lot of meetups across the country. For example, yesterday we had three meetups running in parallel in Chennai, Bangalore, and Hyderabad.
01:23:42
Speaker
They focused entirely on generative AI and its adoption, what are the challenges that come in, how do you trust the models we are building, etc. So that's kind of just evangelizing the subject. Then there are master series and workshops which we are doing, which are, again, let's say, eight-hour workshops
01:24:03
Speaker
for people who have committed that they want to tinker around and go and, you know, build these technologies. So that's the second offering which has come up, where we are saying that if you want to accelerate in this domain, within eight hours you can come learn from the experts and then stay with this group as you continue to build these applications. So that's the... What is the pricing for this eight-hour workshop?
01:24:29
Speaker
So the eight-hour workshop would typically be between 10 to 15,000 rupees. So that's the idea, and all the infra is taken care of for you. So for example, we did the first set of these workshops along with the DataHack Summit, and there were 200 people who were using these GPUs
01:24:49
Speaker
throughout the day. So in fact, we tried reaching out to multiple cloud service providers and at that time no one was running these, you know, GPUs at such a scale. So that's the magnitude of these master series workshops. So that's the second level. And then third, we are coming up with a generative AI program, so similar to the Black Belt program.
01:25:15
Speaker
It would again be a subscription with one-on-one mentorship, and that's launching on 1st of December. So that's already out, released to the community. They can sign up at the early bird price. So for the developer market, those are the three levels in the way we are approaching it. Personally, we've been talking to a lot of experts.
01:25:43
Speaker
I think in general the feeling is this is one of the moments when the acceleration in the domain changes significantly and the technology is out there. I think the key question is how do you adopt and how do you
01:26:00
Speaker
put these in use cases and then still kind of solve some of the UI/UX issues for the users. So I think the most common use case which I see is in information retrieval. So you have a large set of documents, you just want someone to quickly summarize them. That's the easiest one, and I think every company is trying to either build their own chatbots or, you know, something in that direction.
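The "chat with your documents" pattern mentioned here usually starts with a retrieval step. Below is a deliberately naive keyword-overlap retriever in Python; a real system would use embeddings and then a generative model to summarize the hits, and everything here (the documents, the query) is purely illustrative:

```python
def retrieve(query, documents, top_k=1):
    """Return the top_k documents sharing the most words with the query."""
    def overlap(doc):
        # Count words the query and document have in common.
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(documents, key=overlap, reverse=True)[:top_k]

docs = [
    "Refund policy: refunds are processed within 7 days.",
    "Shipping policy: orders ship within 2 business days.",
]
print(retrieve("how do I get a refund", docs))
```

The retrieved passage would then be fed to the language model as context, which is what lets a chatbot answer from a company's own documents instead of its training data.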
01:26:28
Speaker
To something which is very, very refined as well. For example, for Analytics Vidhya, can we create a bot where, you know, people can just ask questions and learn, instead of, you know, figuring out their way through numerous articles, numerous videos? Can I just come and say, you know, I want to learn generative
01:26:53
Speaker
AI in the next two hours, do that for me and I'll follow what you say. So I think we are in the middle of that journey, but what is very clear is that this is going to have a huge impact and is going to fundamentally change how we use information products.
01:27:14
Speaker
Does this give you like a new boost? Because there would be, at this stage at least, a strong interest in learning generative AI. It's like the hot new thing. So it is one of those periods where you want to kind of maximize how much you can
01:27:34
Speaker
create out of the trend. And what we are seeing is actually very interesting. So there are two directions from which we are seeing the revenue boost coming. One is what I mentioned. So people who want to specialize in generative AI.
01:27:51
Speaker
Today, this knowledge is almost non-existent, right? So if you have to, for example, learn machine learning today, there are numerous options, numerous courses, but if you have to really learn and build generative AI, there aren't many options out there today. So that's one area, and that's clearly an incremental revenue stream.
01:28:10
Speaker
The second one is because it has now generated so much excitement, everyone internally is asking what can I do with these technologies? How can I improve using these technologies? And what we are seeing is a lot of companies and people want to learn
01:28:28
Speaker
a data literacy program or awareness program, where even though you're not directly building these technologies, you could still be a salesperson, an HR person or a marketing person, but you need to know how to use these tools and technologies best and how to keep yourself up to speed with them. So those are the two kinds of streams. The first one is where we
01:28:55
Speaker
already have products out there. The second one is more B2B in nature, but I believe that's a much bigger market. So both of these are incremental revenue streams for us.
01:29:09
Speaker
One last question. What is the difference between GPT and an open-source language model like Llama? You host Llama on your own server and you can just download the code for it, or what is fundamentally the difference?
01:29:26
Speaker
So, the fundamental difference is that ChatGPT is built on proprietary technology. So they're not releasing the exact models or the exact way it has been trained. In fact, that's kind of the secret sauce. And the terms and conditions of the model very clearly state that there are limitations about how you can use it commercially.
01:29:53
Speaker
And on the open-source side also there are different levels. So there are models which at least release what they do, but they don't allow for commercial usage. Llama 2 is the most open model in that sense. So it's a continuum, it's not just closed source and open source.
01:30:13
Speaker
But Llama 2 is the other extreme, where it says here is the model, I've got the base version out, feel free to use it the way you want. So that's the continuum. Now, depending on your use case,
01:30:29
Speaker
you would want to pick what is the best one. And I mean, while there are options to stop, you know, data sharing with ChatGPT etc., that's still a concern. So organizations still feel that, you know, their data is going out if they want to use ChatGPT for any internal
01:30:50
Speaker
products or any internal use cases. So that's where open source is going to have a much bigger impact, because a lot of people are actually interested in building these open-source models, because they create a much bigger community impact. Over the next two to three years, they should kind of close the gap in performance between what, let's say, an OpenAI model is doing
01:31:14
Speaker
versus what these open-source models are doing. So a lot of people are contributing, a lot of people are trying out different versions. And then again, you know, some of these are getting trained on 7 billion parameters, some of them are getting trained on
01:31:29
Speaker
100 billion parameters. So people are trying all of these things and then releasing them back to the community. So I think it will take some time, but my sense is, in probably two years' time, the performance gap should definitely narrow down. And once that starts happening, then you have, you know,
01:31:51
Speaker
a true boost of a lot of use cases which are built on completely open-source platforms. It's very similar to, for example, what Python did. Before that, it was SAS. It was very proprietary, so it was getting used in limited use cases. When Python came, it was completely open source, and then the domain proliferated. So similarly, I think as soon as we see better
01:32:20
Speaker
performance, or closure of the gap, we'll see another boost coming.
01:32:28
Speaker
So something like, say, Python would be like executable software, something you can execute on your desktop. Is that what Llama 2 is also, like executable software? It's not there today, to be honest. I mean, today you would need to go on GitHub, you would need to install it on your machine or a server, depending on what
01:32:51
Speaker
you want to do. But it's not too far away, right? So I can very easily see that, you know, there could be an executable file which you can just download. And the bigger challenge right now is making sure you have the hardware, making sure that you choose the right
01:33:10
Speaker
number of parameters, because if you choose a very big one, your normal laptop would not be able to suffice. So that's the challenge, but I think that will be solved. It's not too far away from getting solved even today. And there are people who are trying out these different models. Llama 2, Auto-GPT, a lot of people have installed them on their laptops and then they are trying to
01:33:33
Speaker
automate tasks. So a lot of that action is happening. I think some of these would result in new products, new software, new ways of doing things. Some of them would probably end up as just good experiments. Irrespective, right now people need to figure out their way through GitHub code and repositories, but it's not too far away where you could just click a button, download, and then get started.
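The hardware constraint mentioned here is easy to make concrete: model weights alone at 16-bit precision take roughly 2 bytes per parameter, so a back-of-the-envelope estimate (ignoring activations, KV cache, and optimizer state, which add more on top) looks like this:

```python
def model_memory_gb(n_params, bytes_per_param=2):
    """Approximate weight memory in GB (fp16 = 2 bytes per parameter)."""
    return n_params * bytes_per_param / 1e9

# A 7-billion-parameter model already strains a typical laptop...
print(model_memory_gb(7e9))    # 14.0 GB
# ...while a 100-billion-parameter model needs server-class GPUs.
print(model_memory_gb(100e9))  # 200.0 GB
```

This is why quantization (1 byte or less per parameter) matters so much for running these models on consumer hardware.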
01:34:02
Speaker
And that brings us to the end of this conversation. I want to ask you for a favor now. Did you like listening to the show? I'd love to hear your feedback about it. Do you have your own startup ideas? I'd love to hear them. Do you have questions for any of the guests that you heard about in the show? I'd love to get your questions and pass them on to the guests. Write to me at ad@thepodium.in. That's ad@thepodium.in.