
88. Can OpenAI's Sora learn and model real-world physics? (Part 1 of n)

E88 · Breaking Math Podcast
4.8k Plays · 11 months ago

This is a follow-up to our previous episode on OpenAI's Sora. We attempt to answer the question, "Can OpenAI's Sora model real-world physics?"

We go over the details of the technical report and discuss some controversial opinions by experts in the field at Nvidia and Google's DeepMind.

The transcript for this episode is available below upon request.


Help Support The Podcast by clicking on the links below:


Transcript

Wells Fargo Savings & Virtual Assistant

00:00:01
Speaker
Hello, saver. Whether you're saving for that trip to the tropics or saving for an emergency, now is the time to take advantage of Wells Fargo's savings options. Wells Fargo offers savings accounts that can help you save towards your goals. So, what are you saving for? Visit a Wells Fargo branch or wellsfargo.com slash save to open a savings account today. Wells Fargo Bank N.A. Member FDIC.
00:00:30
Speaker
Fargo, the new virtual assistant from Wells Fargo, makes banking faster and easier. Like this. Fargo, what's my checking account routing number? And this. Fargo, turn off my debit card. And this. Fargo, what did I spend on groceries last month? And that's just the beginning. Do you Fargo? You can. In the Wells Fargo mobile app. Learn more at wellsfargo.com slash Get Fargo. Terms and conditions apply. Your mobile carrier's availability and message and data rates may apply. Wells Fargo Bank, N.A. Member FDIC.

OpenAI's Sora: Can it Understand Physics?

00:01:01
Speaker
Can OpenAI's new text-to-video AI product, Sora, discover real-world physics? That is to say, can it figure out laws of the universe that humans have identified over the last 500 years, things like gravity, thermodynamics, optics, and various material or chemical properties, exclusively through studying video footage, while also having the benefits of a large language model? That is, the ability to label objects, to describe them, to understand their parameters, as well as how they change over time and how they relate to each other.
00:01:31
Speaker
Can it understand and model the real world the way humans do, or possibly in ways that humans can't or haven't yet, strange as that may seem?

Exploring AI Epistemology with Breaking Math

00:01:41
Speaker
We'll discuss this and more on this week's episode of the Breaking Math Podcast, which is our second episode in a series exploring text-to-video AI tools, as well as physics-informed learning tools, and exploring how they learn, what they learn. That is to say, the epistemology of these tools.
00:02:02
Speaker
Now, before I continue, I want to announce to everyone listening to this that we now have a YouTube channel. Well, rather, to those who are not listening on YouTube right now. If you go to youtube.com forward slash at Breaking Math Pod, you'll see the Breaking Math Podcast on YouTube. I will have a series of interviews done live in the studio, as well as a series of audio-only videos where I'll probably add some images just for your enjoyment and benefit. So you can check that out. I've also added a lot onto the community tab of the YouTube channel.
00:02:30
Speaker
I think this tab often gets ignored. I've noticed that a lot of my favorite creators on YouTube will put a lot of content on the community tab, especially their thoughts on a given episode. I intend to do the exact same thing here. So I want to add everything that I've read about Sora so far to the community tab. I also want to add everything I've read about

Professor Steve Brunton and Physics-Informed ML

00:02:49
Speaker
physics-informed machine learning, which is a huge field in engineering, by the way. In fact, there's a course that Professor Steve Brunton is teaching on YouTube right now. If you go to his channel, it's youtube.com forward slash at Eigen Steve. I'll also include links to that in the notes section as well as in the community section. You should check that out as well.
00:03:10
Speaker
So yeah, I'd love for you guys to be engaged in the community section of my YouTube. Oh, I almost forgot. I'm going to have an ongoing list of the biggest questions that we have in machine learning. The biggest questions, the hardest ones to answer, that really explore the depths of machine learning and philosophy and epistemology, or how we, and in this case how machines, know what they know. So please check out the YouTube channel when you get a chance.
00:03:36
Speaker
I think it's a good time to talk briefly about the purpose of this podcast series and this podcast in

Purpose & Ethical Discourse in AI Podcast

00:03:43
Speaker
general. With respect to this series, the purpose of this podcast is to elevate or to improve conversations about machine learning in public. I want to help to improve machine learning literacy.
00:03:55
Speaker
I think that as we as a human species speed full steam ahead into this new world of AI, we all need a common public understanding of the technological capabilities and the technological limitations, as well as the unknowns about these tools. How do we know what we know about AI?
00:04:15
Speaker
you know, how do they learn? I don't want to rely on hype in this channel. Rather, I want to ground everything that we say in the academic literature that currently exists. I want to make sure it's accessible. I don't want to use jargon. I do want to take an extra moment and define terms and, where appropriate and where helpful, use analogies and provide you guys with all the resources that you may need for those who want to learn more on your machine learning journey.
00:04:40
Speaker
I also want to talk a lot about ethical concerns about AI, from the artist community, from humanity in general. I want to talk about concerns with fraud and what's being done about it. Also about regulatory measures. I know that this perhaps goes outside the scope of mathematics a little bit.
00:04:57
Speaker
But I think it's warranted. I think if we are not conscious of the ethics of AI, then we'll have a lot more problems to worry about than just the purity of our math content on our favorite YouTube channels. So I thank you for joining us on this journey, and let me know your feedback. Again, I said earlier we've got a community section on the podcast, we've got a comment section, and we've got an email. If you just send us an email at breakingmathpodcast at gmail.com.
00:05:26
Speaker
Now, I think a really good place to start would actually be talking about some of the interactions in the comments section that I just got on my last video.

Feedback on Sora's Technical Report

00:05:34
Speaker
The last video was the first one that I did on Sora, and essentially, my intention with that video was to introduce Sora and talk about some of the prompts that were used to generate some of the amazing videos that were in the series. I also went over some of the details of the technical report, but as I've said and as others have said,
00:05:51
Speaker
The technical report, well, frankly, was a little sparse, a little sparse in details. Nonetheless, I want to answer some of the questions. Alright, let's see one of them. So one user says, clickbait title warning. You did not answer anything substantial on the comment. I'm sorry, these are his errors. Though interesting question, did you?
00:06:12
Speaker
Okay, so I want to, and again, I'm sorry about the grammatical errors. I make grammatical errors all of the time when I'm typing on my phone. And I apologize, I don't mean to read them out loud, but that's how it was on this particular question. So let me answer it real quick. So I answered, I said, hey, I do thank you for your feedback. The technical report didn't have a whole lot of details in it. Please do check it out for yourself on the OpenAI Sora website.
00:06:36
Speaker
Full disclosure, we were a little shocked and in awe at the initial capabilities of this new tool, and this episode was more of a, is this even real? Am I dreaming? No, really. How do we even tell what's real anymore and what's made up anymore? Can anyone tell me? In fact, a fitting abstract to this initial public release may
00:06:52
Speaker
very well be the phrase, whoa, with a picture of Keanu Reeves on it with his mouth wide open. That about sums up the preliminary technical review, the practical limitations, and present considerations for the useful and/or dangerous applications of this tool for greater

Physics-Informed ML Series

00:07:07
Speaker
humanity. Fair enough.
00:07:09
Speaker
However, we are hard at work on an ongoing series on the present academic literature on the topic of physics-informed machine learning, using the best peer-reviewed publications on the subject that we are aware of: the resources of the AI Institute in Dynamic Systems, brought to you in very accessible terms by the wonderful Eigen Steve here on YouTube, who I mentioned earlier in this episode.
00:07:30
Speaker
So yes, you're absolutely right on the video. This video is a little bit light on the technical side. However, the report itself only had the vaguest description of the training, capabilities, and limitations of the tool, and we were a bit blindsided by the implications of the technology. More to come soon. And I hope, I hope to show you, not tell you, that we have some deep content to come: discussions of the technical approaches to getting generative AI to produce something that is implicitly physics-aware, as well as to form its own
00:07:58
Speaker
generalized descriptions of the physical world.

Implications of AI for Artists

00:08:01
Speaker
The last thing to say on this topic: stay tuned, and thank you for your feedback. And I mean that. I absolutely mean that. I think it is perfectly fair to say this particular episode was a little bit light on the technical specs, and I agree. Still, it was an absolutely mind-blowing reveal from OpenAI, and I was left just stunned at how it worked and at the fact that in seconds it could produce astonishingly realistic videos.
00:08:27
Speaker
If you guys haven't seen the OpenAI videos, go check them out yourself. They're truly astonishing. And suddenly I'm thinking, well, what about all of the artists that work in Hollywood? What are they going to do? What are we going to do as humanity when part of our existence and the meaning that we have is in creating art? And I just don't have answers to all these questions. This is more of a technical podcast, yet I still am considering the humanitarian aspect of things like OpenAI.
00:08:51
Speaker
These are all questions that we're all dealing with, and hopefully on podcasts like this and on other podcasts, we can continue the conversations about where we're at as a human species and what all this means. So I thank you for your feedback. I'd like to go and move on to discussing some of the aspects of the technical report on Sora that is available on the OpenAI website. There have been countless blogs and countless videos that have discussed it already.
00:09:14
Speaker
And one of them that I used was a blog on Medium, actually, and it was written by Mike Young, and it was very, very helpful. And Mike Young actually starts off, and he talks about how he's seen a lot of speculation on things like Reddit and Twitter, and he's found some tweets that he thought were hyperbolic, that were speculations about the capability of Sora, and he criticizes them. And actually, I think I may disagree a little bit with Mr. Mike Young in this case.
00:09:43
Speaker
So he says, and I'll quote him: the worst example I found of this, referring to the exaggerated speculations here, is Dr. Jim Fan's post claiming that Sora is a data-driven physics engine, which has been viewed about four million times on Twitter. He then goes on to say it's not a data-driven physics engine at all. I'm still reading from his blog. He says,
00:10:06
Speaker
Fortunately, OpenAI released a research post explaining their model's architecture. There's no need to speculate if we read what they wrote. And then he goes through and hits the major points here. I want to take a quick second here, and this is absolutely no offense whatsoever to Mike Young, but I haven't found myself agreeing yet that Dr. Jim Fan is incorrect in describing Sora as a data-driven physics engine, and there's a few reasons why.
00:10:34
Speaker
First of all, it comes down to defining the terms. Unless there is a specific term or a specific machine learning use case that we refer to with "data-driven," then I don't know that that term is owned yet, or that it only describes one thing or another. If somebody has more information on this, please let me know. But OpenAI, of course, uses the tools available in a large language model, as well as hours and hours of being trained on video footage, recognizing video footage,
00:11:03
Speaker
labeling things, categorizing things. So there's a ton of data in there. So I suppose in a certain sense, it most certainly can be described as a data-driven physics engine. And again, to actually qualify it as a physics engine, we do not have definitions established in the technical report that establish the parameters or even the prior usage of that phrase, data-driven physics engine. So I just
00:11:32
Speaker
I don't know how comfortable I am saying that it is this thing or that it is not, or even how helpful that argument

Sora's 3D Transformations & Physics Limitations

00:11:38
Speaker
is. Now, let's go ahead and read a few of Dr. Jim Fan's posts real quick. Sorry to pause there for a second here. I had to pull up Dr. Jim Fan's X profile. I keep saying Twitter. I'm sorry. I know that it's no longer Twitter. It's now X. I'm just not used to saying that.
00:11:53
Speaker
On Dr. Jim Fan's X (formerly Twitter) profile, he describes himself as an Nvidia research manager and lead of embodied AI in the GEAR group. Also, he says he's creating foundation models for agents, for robotics, for gaming. He has a PhD from Stanford, and he was actually OpenAI's first intern.
00:12:14
Speaker
Okay, and he also says, I am co-creating a new research group called G.E.A.R. at NVIDIA with my long-time friend and collaborator, Professor Yuke Zhu. G.E.A.R. stands for Generalist Embodied Agent Research. He goes on and says about that group: we believe in a future where every machine that moves will be autonomous, and robots and simulated agents will be as ubiquitous as iPhones. We are building the foundation agent, a
00:12:42
Speaker
generally capable AI that will learn to act skillfully in many worlds. Okay, now I understand the criticism, or rather the skepticism, of any claims made, since part of these researchers' job is, of course, to sell themselves. So I certainly think that some skepticism is well warranted.
00:12:58
Speaker
Now, I think for a minute here, I'm going to read a few things from the quote that is being referred to, or rather the tweet, the post. I'll read parts of it out loud here. It says, if you think that OpenAI's Sora is a creative toy like DALL-E, think again. Sora is a data-driven physics engine. It is a simulation of many worlds, real or fantastical.
00:13:23
Speaker
The simulator learns intricate rendering, quote, intuitive, end quote, physics, long-horizon reasoning, and semantic grounding, all by some denoising and gradient math. I won't be surprised if Sora is trained on lots of synthetic data using Unreal Engine 5. It has to be!
00:13:41
Speaker
Let's break down the following video. Prompt: photorealistic close-up video of two pirate ships battling each other as they sail inside a cup of coffee. Now, we watched that video earlier. We watched it on the previous Breaking Math episode in the intro. Perhaps I'll play it again in this episode, we'll see. He says, the simulator instantiates two exquisite 3D assets: pirate ships with different decorations. Sora has to solve text-to-3D implicitly in its latent space.
00:14:05
Speaker
Now, obviously, we're not going to talk in great detail about what "implicitly" and "latent space" mean. Those are definitely technical terms and definitely considerations, but they're a bit of a rabbit hole and beyond the scope of this episode at this time. He goes on to say that the 3D objects are consistently animated as they sail and avoid each other's paths.
00:14:23
Speaker
That is absolutely true. As you watch that video, there's nothing obviously fake. They're not passing through each other. They're going up and down with the water or with the coffee, and it looks absolutely fantastic. Speaking of coffee, he goes on to say the fluid dynamics of the coffee, even the foams that form around the ships, the fluid simulation is an entire subfield of computer graphics, which traditionally requires very complex algorithms and equations.
00:14:47
Speaker
So again, one of the questions we have here is: using the diffusion method identified in the technical report, where it starts off with noise and the model slowly denoises things, learning about movement and how things change over time in videos, how much physics can it learn from that? And that question is a big, big question. That question cannot be answered in a single podcast, in any case.
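For readers of this transcript: to make the denoising idea concrete, here is a toy Python sketch of the general diffusion recipe. This is illustrative only; Sora's actual training code is not public, and every name and number here is made up for the example.

```python
# Toy sketch of building diffusion training data: blend clean data with
# noise at step t, and train a model (not shown) to predict the added noise.
import numpy as np

rng = np.random.default_rng(0)
T = 1000                                  # number of noise steps
betas = np.linspace(1e-4, 0.02, T)        # noise schedule
alpha_bar = np.cumprod(1.0 - betas)       # cumulative fraction of signal kept

def add_noise(x0, t):
    """Forward process: mix clean data x0 with Gaussian noise at step t."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps                        # eps is the model's training target

x0 = rng.standard_normal((4, 8))          # stand-in for a batch of video patches
xt, eps = add_noise(x0, t=500)            # a half-noised training example
print(xt.shape, eps.shape)
```

A model trained to recover eps from xt at every noise level can then generate video by starting from pure noise and denoising step by step; the open question raised above is how much physics it must internalize to do that well.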
00:15:13
Speaker
That's an area that is ripe for research and discussion about, again, what physics you can infer. And of course, there are going to be limitations per object. A machine learning model that's trained on a bunch of video is going to have some intrinsic understanding of how certain objects act, but it may not generalize why they act that way or be able to apply that to other objects, for example.
00:15:40
Speaker
You know, if you threw in something else that's wooden, it wouldn't know to treat it the same way you'd treat wood in coffee or in liquid. I'm kind of struggling here with how to explain it. It may know more about how ships appear in videos than about the actual physics of objects placed in water.
00:16:02
Speaker
So again, there's lots to dissect with the actual physics modeling capabilities of this tool. Nonetheless, it certainly has some understanding of physics, or rather at least how objects interact in 3D. There's a lot to say about this.
00:16:18
Speaker
He goes on to say a few other things like photorealism, like rendering and ray tracing. He also says the simulation takes into account the small size of a cup compared to oceans and applies tilt shift photography. Again, there's a lot of technical terms here. He basically, you know, he has a lot to say here. I'll include his tweet here in the show notes.
00:16:35
Speaker
And on his tweet, he includes the actual photo as well. And again, I'm watching these ships move carefully and it's just remarkable. There may be a split second where one of the two ships moves a little bit unrealistically, but I'm watching it again and it's just so hard to tell. It's done spectacularly well. And the foam stays around each ship.
00:16:57
Speaker
How does that happen? I think there's going to be a lot that will be revealed with some under the hood analysis, and I think we just have to wait for that to come. It's worthwhile to talk about a few other of Dr. Jim Fan's tweets. I'll be quick on this if you'll give me just a second.
00:17:16
Speaker
Thank you for your patience with those pauses as I pull up these tweets here. I just have to go from one tweet to another. So he's gotten a bit of criticism for some of his other tweets, where he speculated about the abilities of Sora. So in one of his tweets, he says, and I'm going to quote directly: I see some vocal objections that Sora is not learning physics, it's just manipulating pixels in 2D. I respectfully disagree with this reductionist view. It's similar to saying GPT-4 doesn't learn coding, it's just sampling strings.
00:17:42
Speaker
Well, what transformers do is manipulate a sequence of integers, or token IDs. What neural networks do is just manipulate floating-point numbers. That's not the right argument. Sora's soft physics simulation is an emergent property as you scale up text-to-video training massively. GPT-4 must learn some form of syntax, semantics, and data structures internally in order to generate executable Python code. GPT-4 does not store Python syntax explicitly.
00:18:09
Speaker
Very similarly, Sora must learn some implicit forms of text-to-3D, 3D transformations, ray-traced rendering, and physical rules in order to model the video pixels as accurately as possible. It has to learn the concepts of a game engine in order to satisfy the objectives.
00:18:25
Speaker
If we don't consider interactions, Unreal Engine 5 is a very sophisticated process that generates video pixels. Sora is also a process that generates video pixels, based on end-to-end transformers. They are at the same level of abstraction. The difference is that Unreal Engine 5 is handcrafted and precise, but Sora is purely learned through data and intuitive. Will Sora replace game engines? Absolutely. I'm sorry, let me say that one more time.
00:18:53
Speaker
Will Sora replace game engine developers? Absolutely not. Its emergent physics understanding is fragile and far from perfect. It still heavily hallucinates things that are incompatible with our physical common sense. It does not yet have a good grasp of object interactions. See the uncanny mistake in the video below.
00:19:19
Speaker
Thank you for your patience as I read this.

Debates on AI's Emergent Intelligence

00:19:21
Speaker
Sora is the GPT-3 moment. Back in 2020, GPT-3 was a pretty bad model that required heavy prompt engineering and babysitting. But it was the first compelling demonstration of in-context learning as an emergent property. Don't fixate on the imperfections of GPT-3. Think about extrapolations to GPT-4 in the near future.
00:19:44
Speaker
Now, for the purpose of this podcast, I'm not going to judge what he said here. Again, I understand that he perhaps has some stake in selling this kind of product. But again, I'm not saying he's wrong here. In fact, he has defenders. There's another tweet response from a Google DeepMind team lead, and please forgive me on the pronunciation, Nando de Freitas, or @NandoDF. He responded, he says,
00:20:13
Speaker
I agree with Dr. Jim Fan. Life, with all its mind-blowing structure, is about creating order in a universe of increasing disorder. He then provides a quote from an article in New Scientist. Like a cell, a neural network during training takes energy to minimize disorder, that is, to predict and generalize better. In fact, we even call the loss negative entropy. Like life, the net is part of a bigger environment that gives it data and feedback. Like life, the process results in a lot of disorder for the universe,
00:20:43
Speaker
and he gives the examples of TPU and GPU heat. In summary, we have all of the ingredients for intelligence, an emergent property of life, including our understanding of physics. I'd be thankful if someone made a crisper version of this argument.
00:20:59
Speaker
Now, this part is critical. He then adds that the only way a finite-sized neural net can predict what will happen in any situation is by learning internal models that facilitate such predictions, including intuitive laws of physics. Given this intuition, I cannot find any reason to justify disagreeing with Dr. Jim Fan. He then adds: with more data of high quality, electricity, feedback, aka fine-tuning and grounding,
00:21:24
Speaker
and parallel neural net models that can efficiently absorb data to reduce entropy, we will likely have machines that reason about physics better than humans and hopefully teach us new things. Incidentally, we are the environment of the neural nets too, consuming energy to create order, that is to say, increasing quality of datasets for neural net training.
00:21:44
Speaker
These are old ideas going back to Boltzmann and Schrödinger, among others. They provide the theoretical foundations. Now it's about building the code and conducting the experiments, and doing so responsibly and safely, because these are very powerful technologies. So there are a lot of philosophical opinions on these matters. I like to share them for your engagement and consideration. And they are not without their critics. As I mentioned before,
00:22:13
Speaker
even Mike Young was a pretty harsh critic of at least what Dr. Jim Fan had said regarding whether or not Sora can be considered a data-driven physics engine. So without further ado, let's go ahead and talk a little bit more about what's in the technical report that came along with
00:22:32
Speaker
Sora. Now, the first thing worth mentioning is that Sora has something a lot like the tokens in large language models. Tokens are chunks of data broken up into small components so they can be processed much more easily.
00:22:49
Speaker
There's a similar architecture in Sora; they refer to them as patches. I'm going to quote directly from the article by Mike Young here. He says that this patch-based approach let Sora train on videos of widely varying lengths, resolutions, orientations, and aspect ratios; the patches extracted from frames are treated exactly the same way regardless of the original shape of the source video.
00:23:15
Speaker
It's also mentioned that Sora, like other architectures such as GPT-4, GPT-3, and even GPT-2, is a transformer-based architecture. It works by making predictions, based on these patches, while the videos are still noisy. It predicts what the denoised images will eventually be, and those predictions help to inform the weights of the various neurons throughout the neural network architecture.
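Since the report itself doesn't include code, here is a rough, hypothetical sketch of the "patches as tokens" idea for transcript readers: cutting a video tensor into fixed-size spacetime patches and flattening each one into a token-like vector. The patch sizes and function name below are invented for illustration, not taken from Sora.

```python
# Hypothetical sketch: split a (frames, height, width, channels) video into
# spacetime patches, one flattened vector per patch, like tokens in an LLM.
import numpy as np

def video_to_patches(video, pt=4, ph=16, pw=16):
    """Cut a video array into pt x ph x pw spacetime patches."""
    f, h, w, c = video.shape
    f, h, w = f - f % pt, h - h % ph, w - w % pw   # crop to whole patches
    v = video[:f, :h, :w]
    v = v.reshape(f // pt, pt, h // ph, ph, w // pw, pw, c)
    v = v.transpose(0, 2, 4, 1, 3, 5, 6)           # gather the patch grid first
    return v.reshape(-1, pt * ph * pw * c)         # one row per patch

# Clips of different shapes become sequences of identically sized patch
# vectors, which is what lets one transformer handle all of them.
print(video_to_patches(np.zeros((16, 128, 256, 3))).shape)  # (512, 3072)
print(video_to_patches(np.zeros((64, 256, 128, 3))).shape)  # (2048, 3072)
```

However Sora actually implements this step, the payoff described in the report is the same: once everything is a uniform sequence of patches, resolution, duration, and orientation stop mattering to the core model.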
00:23:41
Speaker
By the way, transformer architectures are fascinating, and if you want to learn more about them, there are several videos on Brit Cruise's channel that talk about machine learning, including his most recent one on a 30-year history of machine learning. And forgive me, that is not the exact title. But if you go to Brit Cruise's channel,
00:23:58
Speaker
one of his latest videos, about ChatGPT 3 and 4, has an outstanding section on exactly how the transformer architecture works and why it's so important in the process of machine learning. I'll include that in the show notes.
00:24:13
Speaker
The technical report then goes on to talk about some of the things that being a patch-based machine learning model allows for, such as handling any resolution or duration or even orientation at test time simply by arranging the patches in any desired shape before starting the diffusion process. Again, I'm reading from Mike Young's report on Medium.
00:24:34
Speaker
There are some interesting claims that we've talked about before. For example, the report does say that Sora appears to develop some kind of an understanding of 3D scenes, where characters and objects can move realistically in a continuous manner. There's not much more going into this specific detail. Obviously, if it learns from a lot of videos, of course it'll have some intrinsic familiarity with them. But how much actual information on the movement of objects it has, the report doesn't
00:25:04
Speaker
really say. It does mention object permanence, and it says that Sora can keep track of entities even when they temporarily leave the frame or become occluded. There's an example mentioned later on where there's a sign with some text, and some people walk in front of the sign so that it's occluded, and then the text is presented in the same way after the people leave. So
00:25:26
Speaker
there's certainly an awareness of object permanence, and that may or may not have to do with its awareness of 3D environments in general, or just appearances. It also says that Sora has the ability to simulate some basic world interactions, like a painter leaving strokes on a canvas.
00:25:46
Speaker
Now, there are some problems with it. For example, there are some objects that appear spontaneously, such as the puppies in those videos we talked about in the last episode, the gray puppies, where there's one, then two, then three, and they just kind of appear out of thin air. So not everything is perfect. Now again,
00:26:06
Speaker
If we're going to talk about what this actually means for the ability to model physics, unfortunately, we just don't have enough information right now.

Future of AI Learning with Physics

00:26:15
Speaker
We don't know why it handles some objects very, very well and other objects not so well, or what that means in terms of how the objects are understood or categorized or generalized. It's too complex a system to talk about in any great detail, aside from simply saying that there are imperfections still being worked out.
00:26:35
Speaker
There is a concluding section in Mike Young's article where essentially he says: so, contrary to unfounded claims, Sora operates not through game engines or as a data-driven physics engine, but through a transformer architecture that operates on video patches, in a way similar to how GPT operates on text tokens.
00:26:53
Speaker
Okay, so again, we have here a clear distinction being drawn between what Sora is and how it operates and other data-driven physics engines. I do wish that there was an example of explicitly what is considered, at least by Mike Young, a true data-driven physics engine.
00:27:14
Speaker
Perhaps he's referring to game engines themselves. But again, who owns that term data-driven physics engine? What has that term traditionally been used for? And is there some exclusivity to that label? I don't know. I'm not qualified to say that, but again, I'm not in a position right now where I want to say that Sora is not a data-driven physics engine because it very well may be. We just don't know that.
00:27:40
Speaker
Now, as this series continues, we are going to keep talking about Sora and its capabilities as a possibly physics-informed machine learning model, and why that may or may not be the case. I think the question warrants further exploration. We're going to dive into a lot of content that has been created and provided to the public by Professor Steve Brunton on his channel.
00:28:03
Speaker
In his wonderful series on physics-informed machine learning, he first talks about five essential steps involved in machine learning. The first steps involve identifying the problem that you want machine learning to solve, if that's the right way to say it.
00:28:22
Speaker
identifying what it is that you want machine learning to do, whether it be a generative model or something else. Step two is identify the specific data that you need to gather that you're going to train the model on. There's more steps provided as well that are very, very helpful.
00:28:40
Speaker
I mention these because in his video, he talks in detail about how to infuse physics knowledge in every single one of these steps. First of all, just identifying the problem that you want to solve may have some physics knowledge inherently in it.
00:28:56
Speaker
Now, I look forward to watching his newest video, which just launched, because it's about that very first step: how can you have some physics knowledge inherent just in how you identify the problem?
00:29:11
Speaker
And then, of course, later steps involve things like deciding on your architecture, which is probably the single most exciting step. I think it's exciting because this is what Professor Brunton describes as the current alchemy of machine learning. What he means by that is there are many, many different architectures of machine learning.
00:29:30
Speaker
There are convolutional models, there are recurrent neural networks, there are models that are better at identifying images, and models that are better at time-sequence events like, oh gosh.
00:29:46
Speaker
Hey, that's Siri. Siri just interrupted us. Did you hear it? It said, sorry, I can't search for videos on my watch. Somehow Siri picked up something we were saying. What I wanted to say, though, and thanks, Siri. I'm going to leave that in here in the editing. Sorry, guys. In Steve Brunton's video, essentially he talks about two major things. The first one is how to infuse physics natively into a machine learning process.
00:30:11
Speaker
But the second thing he talks about, though he does not go into great detail, is how you get machine learning itself to learn physics. How do you get it to learn physics? This is a fascinating question. It has been approached for many, many years now, and there are many discussions on how you can set up machine learning, considering everything that one can control in setting up machine learning,
00:30:35
Speaker
and also considering the vast amounts of processing power that we now have access to. How can you not only identify a problem and get some training data, but also pick the correct architecture, and then pick some kind of a reward function? How are you going to encode and specify the parameters of a reward function? Do you simply build a reward function
00:31:02
Speaker
around an understanding of a law of physics? Yeah, that's one way to do it. That's how you can get a machine learning model to produce output that appears to have some knowledge of physics. But that doesn't answer the question of how you can get machine learning to learn the law itself. And then, lastly, there's the optimizing component. How do you optimize? Or rather,
00:31:31
Speaker
there's an optimizing function that responds to the reward function and basically tweaks the parameters. That's kind of the response: how do you make the changes during training so that you can demonstrate being informed by physics? These are all wonderful questions that are explored in great detail in Professor Brunton's video, and I think that might be what my next podcast is on. We'll see.
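To give transcript readers one concrete flavor of "building the reward function around a law of physics," here is a minimal, hypothetical sketch in the style of a physics-informed neural network (PINN). It is not from Professor Brunton's course or from the episode; it just illustrates penalizing the residual of a known law, here free fall, y'' = -g.

```python
# Minimal PINN-style sketch: a tiny network learns y(t) for free fall by
# minimizing the residual of the physical law y'' = -g plus the initial
# conditions y(0) = 0 and y'(0) = 0. Illustrative assumptions throughout.
import torch

g = 9.81
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    t = torch.rand(64, 1, requires_grad=True)       # sample times in [0, 1)
    y = net(t)
    dy = torch.autograd.grad(y.sum(), t, create_graph=True)[0]
    d2y = torch.autograd.grad(dy.sum(), t, create_graph=True)[0]
    physics_loss = ((d2y + g) ** 2).mean()          # residual of y'' = -g

    t0 = torch.zeros(1, 1, requires_grad=True)
    y0 = net(t0)
    v0 = torch.autograd.grad(y0.sum(), t0, create_graph=True)[0]
    ic_loss = (y0 ** 2).mean() + (v0 ** 2).mean()   # y(0) = 0, y'(0) = 0

    loss = physics_loss + ic_loss                   # the "reward" is the law
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, net(t) should approximate -0.5 * g * t**2 on [0, 1).
```

The point of the sketch is the division of labor described above: the physics lives in the loss function, while the optimizer only ever sees a number to minimize; it never has to learn the law itself.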
00:31:55
Speaker
But again, none of this answers the question of how you get machine learning to learn physics on its own. And I do think that part of the answer is studying video footage, similar to how Sora does it, but also making generalizations and being able to look at past data.
00:32:16
Speaker
These are huge questions and they're all involved. That's all the time we have for this episode. I thank you for listening.

Support & Contact Information

00:32:24
Speaker
My name is Gabriel Hesch. This is the Breaking Math Podcast. Please email us at breakingmathpodcast@gmail.com. You can also check out our new website at breakingmath.io.
00:32:34
Speaker
We'll have all of the transcripts posted very shortly at breakingmath.io. And please check out our YouTube. The best way that you can help us right now is to subscribe to our YouTube channel. We are trying to get past 500 subscribers, so if you could consider subscribing, we would be delighted. Also, of course, we welcome support on Patreon. If you go to patreon.com forward slash BreakingMath,
00:32:59
Speaker
you'll find our page there. And I believe right now the monthly tier for commercial-free episodes starts at $5. I'm actually thinking I might just drop that down to $3, because I think that's more the market value for similar podcasts. So by the time you hear this, it may be down to $3. We would appreciate your support, and thank you very much.