
In this episode, host Dr. Jill Fennell sits down with Dr. David Torello, Director of the Clark Scholars Program and the College of Engineering's Dean's Scholars Program in the Woodruff School of Mechanical Engineering at Georgia Tech. The two discuss code commenting, an oft overlooked communication genre in Engineering, and consider best practices and use cases for creating effective code comments. 

Show Notes and Timestamps:

  • 00:54 - Introducing guest
  • 02:02 - Code Commenting as Communication
  • 03:14 - Purpose of Code Commenting
  • 05:44 - Mapping out Communication for Code Commenting
  • 09:17 - Genre Expectations for Code Commenting
  • 15:33 - Audience needs and Usability of Code Commenting
  • 23:02 - Code Commenting Ethos and Impact of Code Comments
  • 24:53 - Trustworthiness of Code Comments
  • 30:29 - Quality of Code Comments
  • 34:15 - Student Perspectives on Code Commenting
  • 37:56 - What David wants to see from Code Comments
  • 42:04 - Advice for students wanting to keep their skills fresh
  • 46:14 - General advice for code commenting and understanding communication

A transcript of this episode is available here.

Episode edited by Lee Hibbard.

Transcript

Introduction to Code Commenting

00:00:40
Speaker
Whether you're tuning in for the first time or returning for another season, we're excited to have you with us.
00:00:54
Speaker
Today I'm joined by Dr. David Torello. He is the director of the Clark Scholars Program and the College of Engineering's Dean's Scholars Program. Welcome, David.
00:01:06
Speaker
Thank you, Jill, for having me. Thanks. Today we're going to be talking about code commenting. Is this really communication? Yes, it is communication. And it's a very important part of communication in a way that is not immediately obvious to many people who are writing code.
00:01:24
Speaker
Because code is normally thought of as something that is very deterministic. You write a piece of code. It's a computer operation. It is a line, a thing that you type, and a computer interprets it in a one-to-one fashion. It only does one thing, the one thing that you ask it to do, which doesn't sound a whole lot like communication sometimes.
00:01:45
Speaker
Ask a computer to do something and get a result, but it's not the computers that are really using the code. It's people that are using the code. And so whenever you have people interacting with a document or interacting with a piece of technology, it is a form of communication.

Code Comments as Communication Tools

00:02:02
Speaker
Okay, so what's the purpose then? What's the purpose of commenting your code at the code level? Well, there are two purposes. The first is it's a love letter to yourself. It is very easy to write a piece of code to make something in MATLAB, to write something in Python. It honestly doesn't matter what language you're writing in.
00:02:22
Speaker
And then you come back to it two days later and you have no idea what you did because coding is not typically done in natural language. It is done in whatever instruction set the computer can understand.
00:02:33
Speaker
And it's not very readable most times, even by you, the person who wrote it. And if you forget the context in which you wrote the code, it can be just as hard to figure out what the hell you were trying to do in the first place as the other person who you're going to give it to at some point.
00:02:49
Speaker
And the other thing about commenting is that you often aren't writing code just for yourself. You're writing it for incorporation into larger efforts. And so if you're writing with anybody else in mind, you need to be able to tell them what you did.
00:03:06
Speaker
And as anybody who has ever tried to read code knows, it is not 100% of the time self-evident what code is supposed to be doing. We have our code that the computer is going to run.
00:03:17
Speaker
And then we have the code comments, which are really just for the people who are working the computers. It's literally just for the humans, because the whole point of commenting code is that comments are non-executing text that just lives inside of the code. The computer is in fact instructed not to pay attention to anything that appears in the code comments.
00:03:38
Speaker
It is purely for humans. So then why should anyone take the time to comment code correctly when comments are non-executed code? Well, again, the point of code is not necessarily just to write it and have a computer do it.
00:03:55
Speaker
It is usually towards some larger effort. You are doing something that is in many ways wrapped up in a larger framework, in most cases in modern engineering practice, or even if you're just working in a group project in a course in our mechanical engineering discipline.
00:04:11
Speaker
So if you're interacting with any people other than yourself, you owe it to them to not write completely indecipherable garbage.
00:04:24
Speaker
And it's really easy to write completely indecipherable garbage when you're coding. I write code for myself all the time, and it comes out as

Challenges with Poor Documentation

00:04:31
Speaker
completely indecipherable garbage. And then I go back to try and edit it, and I have no idea what I was doing.
00:04:35
Speaker
And again, this is like two days later sometimes, three days later, and God forbid it's a month, a week. I mean, when I was doing my doctoral work, I had, I don't know, hundreds of pieces of code that were interacting with each other, and each one of them could be a couple thousand lines of code. It's not really that hard to start writing these really large research codes.
00:04:54
Speaker
And there were some operations, particularly early signal processing operations that I was trying to get through quickly, where I didn't really document my process: stuff about file input and output, how do I read this file, where is this file coming from, where is it stored. It was just really poor documentation at that level of where things are coming from, how it all fits together, what this is trying to do, and which piece of math this is using. There were many times when I would have to rewrite stuff that I'd already written because I couldn't figure out exactly what it needed to function.
00:05:31
Speaker
And if I was getting an answer I didn't expect, it was very hard to process why that was happening in this poorly commented code. So personal experience, unfortunately, with why commenting code is very important.
00:05:43
Speaker
It sounds then that we can apply our sort of traditional structure for mapping out a communication to this, so that we can make better communication decisions. You have a communicator, in this instance, the code writer.
00:05:58
Speaker
You have an audience, the future person who's going to be using the code, be that yourself or somebody else. And you have a purpose for the code comments, which is to clarify their use value or whatever specifics are going on.
00:06:12
Speaker
And so that makes the basis of the rhetorical triangle, but layered onto that is the context. Right. And I assume that this is really important to consider whenever you're trying to make decisions about what the future user of the code and code comments is going to want to do with it.
00:06:27
Speaker
Absolutely. And again, the real cardinal sin here is to assume that coding is some...

Communication Structures in Code

00:06:33
Speaker
objective truth, that there's one right way to write code or that the point of it is to find a right answer.
00:06:42
Speaker
I mean, that's kind of a problem that a lot of people have with engineering work in general: thinking that there's just a right answer to a problem. But so much about coding, while seen as this extremely fact-driven, knowledge-driven, deterministic process, is in fact taking into account all the things you just mentioned in the rhetorical triangle. And not just in some sort of tertiary way, but at the entire core of its existence, it's doing these things.
00:07:10
Speaker
It is communicating to an audience a result based off of a set of discrete choices that you made. And there's not one right way to code even the simplest of operations.
00:07:21
Speaker
There are certainly multiple ways to solve problems physically. And if your code is some sort of manifestation of a physical model, but just solved using some numerical method, those two things interact and create a huge space of interpretation.
00:07:39
Speaker
So understanding the point of the code you're writing, understanding who it's going to, and understanding kind of what you did and how you're going to communicate that result is...
00:07:51
Speaker
more than just a nice thing to do for somebody. It's an important and necessary part of the entire process. I mean, it's fundamental. Right.
00:08:01
Speaker
It's literally the fundamentals of communication. You have to understand the circumstances of the user of your communication in order to make good communication decisions.
00:08:13
Speaker
Yeah. And I think at this point, my students are sick of hearing that from me. I've certainly been trying to beat it into their heads this semester. And at this point, I can't tell if they're humoring me or if they're just playing along.
00:08:25
Speaker
But hopefully the end result is the same, which is that they consider these things which we think to be true, objectively true, that there's knowledge out there to be discovered and realize that really what's going on here is how we approach the problem and how we communicate our solution defines the knowledge that is used to solve our engineering problems. And that that manifests completely in the domain of writing code and writing code comments.
00:08:49
Speaker
That's really interesting because that shatters this idea that it's self-explanatory or apparent in some way. And in fact, the fact that you could have chosen a multitude of different decisions and you chose this one for a specific reason that the future user of the code might want to use or might want to go in and change.
00:09:11
Speaker
And that that needs to be communicated just in a code comment, it seems really difficult. So I want to get back to the audience at some point, but right now I think maybe we should talk a little bit about the genre expectations of code commenting. Could you tell all our listeners a bit about that? Sure.

Consistency in Commenting

00:09:28
Speaker
The most important thing to do when you are writing code comments is to make them as concise and as clear as possible.
00:09:36
Speaker
So we just talked about how people think of coding as deterministic, and in reality, coding is so much more than that. And yet here I am about to tell you some things that are fairly deterministic. Well, it's a genre expectation. Sure. And the reason why I started thinking about that because of what you said is that we have genres in order to help us make meaning out of the vast expanse of possible meanings. Right. Genres help us constrict
00:10:02
Speaker
and think about things in a certain way so that we can absorb information faster. Right. Yeah. So if we're thinking about norms of communication within coding, certainly there are a few more axiomatic things that you can keep in mind. So the first is clarity and precision.
00:10:17
Speaker
Just like any engineering communication, you're striving for one-to-one meaning between what you say and what you're supposed to interpret from that. There's not supposed to be any ambiguity here.
00:10:29
Speaker
And that actually still tracks with kind of everything we're talking about. What are some examples? If you're trying to calculate the average of a data set, you don't just say calculate the average, by the way, because there are many different types of averages, in fact.
00:10:46
Speaker
So if you're calculating the mean, say you're calculating the mean, but also don't say calculating the mean of the entire data set over the run from blah, blah, blah, unless all of that is important. If all you really need is to take any arbitrary input vector of numbers and calculate the mean of that vector, then the code comment should basically just be calculate the mean of input vector blank.
00:11:08
Speaker
It is short. It is correct. It is clear. And oftentimes it is sufficient. So clarity and precision are things to strive for. And precision usually means brevity in some ways, or at least they're very linked, which can be really difficult to hit if you're a windbag like me, who just likes to talk and talk and talk and write and write and write.
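The mean example above could look something like the following. This is only a sketch: the function name `summarize` and the sample values are made up for illustration, and the rejected comments are shown alongside the concise one for contrast.

```python
from statistics import mean


def summarize(samples):
    # Too vague:  "calculate the average"  (mean? median? mode?)
    # Too long:   "calculate the mean of the entire data set over the run from ..."
    # Calculate the arithmetic mean of input vector `samples`.
    return mean(samples)


# Hypothetical input, chosen so the result is easy to check by hand.
print(summarize([2.0, 4.0, 9.0]))  # 5.0
```

The kept comment names the kind of average and the input it applies to, and nothing more.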
00:11:30
Speaker
The other thing that I like to make sure is happening, at least in my commenting, and I think it's very important, is consistency. So if you comment one way in your code,
00:11:41
Speaker
you should comment that same way everywhere in your code. Exactly. I mean, not doing that just really increases the cognitive load on your reader. Yeah, and you start now violating your own norms that you're trying to set. I mean, there's no hard and fast rule of what your comments should look like, but at a minimum, a reader reads one or two and now they know what to expect, and all of a sudden they're looking for this type of information quickly at a glance. If you have thousands of lines of code, they're trying to parse all of this and find the salient bits for them.
00:12:08
Speaker
And if all of a sudden your commenting has changed form completely, if you were doing inline commenting with complete sentences and all of a sudden they're looking for bulleted lists of information, their eye won't catch that necessarily very quickly.

Intent Over Operations in Comments

00:12:21
Speaker
And now you've just kind of created another layer for them to miss in terms of figuring out what you were trying to do. So if you have a standardized format, you will want to stick to it the entire time.
00:12:35
Speaker
And there are different ways to comment your code in terms of the practical, actual nuts and bolts of doing it, but whatever you choose, just continue to do it the entire time. And another thing that's really important to understand is that commenting should not just be another statement of the code that you've written.
00:12:56
Speaker
So the classic example of this would be a counter in a for loop. If you have a for loop and you're indexing that for loop using some counting variable, the classic one in MATLAB coding, which is, for better or worse, my home coding language, because I am an engineer, not a computer scientist. But in MATLAB, you use for loops, and you're always taught i: i is the counting variable. And this is actually somewhat true in most preparatory coding environments.
00:13:20
Speaker
So now every time you run a for loop, you're supposed to iterate your counter if you want to keep track of that for some reason. And so they'll say counter, maybe, or i: counter equals counter plus one. And at that point, you now want to explain what you're doing, because if there's a discrete operation, you want to code it. And then what people will usually say when they're first starting is something along the lines of: comment, add one to counter.
00:13:46
Speaker
Which is basically just the exact same thing as what you've written in code. It's not telling the user anything they couldn't read, and this one is very natural language readable. And there are some coding languages in which what you just wrote on that back end, honestly, you give that to ChatGPT and it's going to write the same line of code back to you almost verbatim.
00:14:04
Speaker
So what's the proper thing to write then? The intent was that you are iterating the counting variable by one, or updating the counting variable by whatever. So that would be a better comment? That would be a much better comment.
00:14:20
Speaker
Change the counting variable to the next value. Why does that make it more usable for an outsider, a reader, or even future you? Well, the counting variable situation is simple enough that nearly anybody who comes to read it and has a basic understanding of arithmetic could probably figure that out.
00:14:41
Speaker
But as you get more and more complicated in the math you're doing, especially as you start embedding math inside of other math or functions inside of other functions, that really quickly readable connection between this output, this final thing that you're trying to do with this line of code and what's going into it,
00:15:00
Speaker
can become very blurry very quickly and even sometimes almost impossible to run down if it's tens or hundreds of pieces of code that are interacting with each other, which is, for better or worse, a modern coding environment.
00:15:13
Speaker
So if all you're doing is saying, add one to input of blank, what you're missing is the point of adding one to input of blank and what you're getting at the back end. So what you should really be focusing on is maybe the output of what's happening with that line of code, as opposed to strictly just the operation itself.
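One way to picture the difference between restating an operation and stating its intent is the short sketch below. The variable names, the sample readings, and the threshold are all hypothetical, chosen only to give the counter a purpose.

```python
readings = [0.2, 1.7, 0.9, 2.4]  # hypothetical sensor samples
threshold = 1.0                  # hypothetical limit

count_above = 0
for value in readings:
    if value > threshold:
        # Restates the code (unhelpful): "add one to count_above"
        # States the intent: record one more sample that exceeded the limit.
        count_above += 1

print(count_above)  # 2
```

The second comment tells a reader what the running total means at the end of the loop, which the line of code alone does not.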
00:15:33
Speaker
I think that this is a good time to go back to thinking about really empathizing with the audience and why thinking about the usability of your code comments is really important because one thing that I tell all of our students as a part of the web program and throughout the Woodruff School is that the engineer's job is to help inform decision makers. Right.
00:15:58
Speaker
How does code comment or code commenting help inform decision makers?

Different Levels of Commenting

00:16:03
Speaker
That is an excellent question because decision making when it comes to interacting with code comes at so many different levels and so many different stake holding capabilities that it can kind of make your head spin when you really think about it.
00:16:13
Speaker
Let's say that you are another coder and you're working. The audience is now, I'm passing off a code to another coder. That makes a lot of sense in the context of commenting code.
00:16:25
Speaker
You are trying to make this readable for someone who is going to use it. Great. So now you want all the stuff we talked about: clear, concise. You want to avoid redundancy, right? Now it's interacting with one person. Hopefully you've done that job. They know what's going in. They know what's coming out. But by the way, there are so many different levels at which you can comment your code for that audience. So you can comment each line of code,
00:16:49
Speaker
or you can break it up into sections of code and have section headers and comments for each of those different sections, which is just grouping operations together that should be naturally grouped together. You have function level descriptions, so at the top of the function, that would be consistently referred to as a header. That defines what the entire function does and doesn't really get into the nitty-gritty detail of each line, but it tells you the more macro important information like what are the inputs,
00:17:16
Speaker
what types of functions or variables or data types should be the inputs and what's going to cause an error and ultimately what the output is and kind of the point of the whole thing, which is it does this thing. It takes a Taylor series. It performs a Fourier transform. It finds the peaks of this waveform. These are very common engineering requirements of code.
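A function-level header of the kind described here might be sketched as follows. The function `find_peaks` below is a hypothetical, deliberately simple pure-Python example written for this illustration (it is not SciPy's function of the same name), and the layout of the header is one possible convention, not a required one.

```python
def find_peaks(signal, min_height=0.0):
    """Find local maxima in a sampled waveform.

    Inputs:
        signal: sequence of numbers (int or float), at least 3 samples.
        min_height: only peaks strictly greater than this value are reported.
    Output:
        List of indices into `signal` where a sample is strictly greater
        than both of its neighbours and greater than `min_height`.
    Errors:
        ValueError if `signal` has fewer than 3 samples.
    """
    if len(signal) < 3:
        raise ValueError("signal needs at least 3 samples")
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i] > signal[i - 1] and signal[i] > signal[i + 1]
        if is_local_max and signal[i] > min_height:
            peaks.append(i)
    return peaks


print(find_peaks([0, 2, 1, 3, 0.5, 0.4, 1, 0], min_height=0.8))  # [1, 3, 6]
```

The header covers the macro information (what goes in, what comes out, what raises an error) without narrating each line of the body.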
00:17:37
Speaker
And at that point, the commitment to the individual coder kind of ends because they know what your little thing did. Mm-hmm. But now you have your next audience, which would be a group of coders or a collaborative environment, a coding department necessarily. I don't know if you'd even really call it a coding department, but a department of some decision makers who all interact with this code.
00:18:00
Speaker
And now you have a whole different cluster of problems that you have to solve with your comments, which is: how does this interact with other pieces of code? That comes at the header level, so that's kind of where we left off with the individual coder. But now if one thing changes, it impacts every other thing, or can impact every other thing, in that larger collection of code. So in the collaboration sense, you need to keep track of an entirely different piece of information, which is changelogs. And if you've changed something in the code and someone doesn't see that and expect that as they're interacting with it, if they maybe
00:18:36
Speaker
interacted with your code two iterations ago when things were one way. You were taking some sort of interesting Fourier transform to do your signal analysis. But now you've changed it because Fourier transforms weren't it.
00:18:50
Speaker
And you need to go to a wavelet transform. And now you're using wavelet transforms. There's all sorts of assumptions that are now embedded in this operation. And if you didn't put that somewhere in the changelog, then they're not going to have any clue of what's coming out because the data may look completely different.
00:19:06
Speaker
And it gives you a vector or a matrix of numbers. You will get data to be manipulated, but it will be unfamiliar data. And they had no shot at understanding this because your code is just named signal processing.
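A changelog kept in a function header could be sketched like this. Everything here is an assumption made for illustration: the version labels are invented, the function name simply echoes the "signal processing" example, and the body is a very simple single-level Haar step (pairwise averages and differences) standing in for whatever wavelet transform a real project would use.

```python
def signal_processing(samples):
    """Single-level Haar wavelet step (pairwise averages and differences).

    Input:  sequence of numbers with even length.
    Output: (approximation, detail), two lists of half the input length.

    Changelog (hypothetical entries, for illustration only):
        v1: returned Fourier magnitudes, one value per frequency bin.
        v2: switched to a Haar wavelet step. The output is now a pair of
            lists, so callers that expected v1 magnitudes must be updated.
    """
    if len(samples) % 2 != 0:
        raise ValueError("samples must have even length")
    pairs = list(zip(samples[0::2], samples[1::2]))
    approximation = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approximation, detail


print(signal_processing([4, 2, 5, 5]))  # ([3.0, 5.0], [1.0, 0.0])
```

Without the v2 entry, a caller would still receive numbers back, but would have no way of knowing that they no longer mean what they used to.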
00:19:18
Speaker
Right. Would there ever be a situation where you were maybe a part of a larger firm and your supervisor told you, yeah, we have some code in the code library that will do this, you know, take a look at it, see if we can do this job faster because we have it in the library.
00:19:35
Speaker
Would there ever be a situation where you would take a look at that, and either there are no comments or you think that the comments are so terrible that you wouldn't want to run it? Huh. You mean like most of the time I've been having to do this in my jobs? Yes.
00:19:48
Speaker
Well, what's the risk there? The risk is, honestly, it's everything. You cannot afford to put code into production that you do not understand and have not tested.
00:20:03
Speaker
One of my favorite things to bring up here is the concept of cargo cults. It is a fascinating subject. The origin of the phrase cargo culting is a whole story in and of itself.
00:20:13
Speaker
But the bones of it are: when we were making air bases in World War II, we would go to an island and there would be an uncontacted civilization, and we would do all the things that are required to deliver cargo,
00:20:25
Speaker
like building air traffic control towers and making runways and all the control sticks and, you know, the glowy lights and the headphones. And then we leave because the war's over, and the uncontacted civilizations saw that when we did certain things, supplies happened, and supplies are good. We want supplies, we want food, we want materials.
00:20:46
Speaker
So they would go to those structures and they would make headphones that looked exactly like our headphones. They would make glowy sticks, maybe not glowing, but certainly sticks that looked exactly like our glowy sticks. And they would enact the rituals and nothing happens. There's no cargo.
00:20:59
Speaker
There's nothing there.

Ethical Responsibilities in Coding

00:21:01
Speaker
And so this phrase has now been subsumed into all sorts of disciplines. But in the world of coding, it's just taking code that someone else wrote and kind of like understanding vaguely what it's supposed to do,
00:21:13
Speaker
and then importing it completely into your code, thinking it's going to do just that thing, but it does either not that thing or something totally unexpected and different, and you've now imported a bunch of bugs.
00:21:23
Speaker
But cargo culting is what this is called. It is taking something that you have vaguely observed work and then wholesale importing it into the thing that you are trying to do. Yeah, performing the ritual and thinking that you're going to get the result that you want. And actually having no idea what the result's going to be. Yeah, and you never had a shot. And you're much more likely to cargo cult something on accident if that thing is not commented, by the way, because you would never really have been able to peer into it, and maybe it's written in a language you're not familiar with, or the expression of that language is too complicated.
00:21:54
Speaker
These are all possible things and things that I have personally dealt with. But the point is, you need to be able to go into code and understand it before you can use it. That's the minimum entry.
00:22:06
Speaker
And if you can't reliably say that, it is an immoral or unethical act to take that code (first of all, attribution: cite your work; and second of all, just to take it, not understanding it completely) and put it into production. You could overwrite a database on accident and get fired. You can go to any corner of Reddit that deals with coding horror stories, malicious compliance, or things of that nature. Right. That's a very fun subreddit, by the way.
00:22:39
Speaker
And hear about overeager, overconfident programmers doing their own thing, and they've launched something into production. And now the IT person, what was that? Tales from IT, I think, is another subreddit, something along those lines, or IT horror stories, where some overconfident programmer goes in and now IT has to clean up the mess with all their tape drive backups.
00:22:58
Speaker
You don't want to be that person. That person gets fired. Thinking back to the rhetorical triangle of trying to understand code comments, as the communicator of that aspect, it sounds then that developing your ethos in the code comments is really important if you want to be a valuable coder at a firm. Sure, yeah.
00:23:19
Speaker
You will also quickly see that there are different styles that people come up with. And the styles can be anywhere from helpful to condescending.
00:23:32
Speaker
And you don't want to be the person who condescends to others purely through written code commenting. I mean, that's pretty impressive, honestly, if you can come across that way. But it is not good. And I don't really know how else to say how much impact the way that you choose to create these lasting communication artifacts embedded in this material has, how important that impact is, and how far reaching it can be.
00:23:57
Speaker
You know, it really just takes something which seems like a menial piece of work and turns it into an existentially important question for that entire operating assembly of codes that are working together. I mean, if you don't tell people what your code is doing, they can use it incorrectly and cause damage.
00:24:21
Speaker
Or, and it seems kind of sad, it can never be used. It could never be used. I come from the humanistic tradition, and so the idea that you would create a piece of art and no one would ever look at it and just throw it away and like, okay, now it's dead, is sad. Yeah. So I assume engineers can feel the same way about the things that they produce.
00:24:38
Speaker
Yeah, I 100% agree. Yeah, it would be a real shame to spend hours, days, months, years, hopefully not years, but sometimes years, creating stuff that never sees the light of day.
00:24:48
Speaker
And that's the stake. If you can't trust it, you can't use it. Full stop. Speaking of that, all right, so we just talked about the communicator side. On the audience side, imagining that, you know, your boss told you to check the library, see if we can use this code,
00:25:03
Speaker
What signs are you looking for to see if you trust this code? And what signs, if you see them, have you thinking, I might not use this? That's an excellent question.

Trust and Reliability Through Comments

00:25:14
Speaker
The very first thing I'm checking to see is if somebody has been willing to stake their claim on it.
00:25:19
Speaker
Is there attribution at all? Did somebody actually in their head say written by so and so? That's a big line to cross. I mean, if you are willing to put your name on something, your reputation is now on the line. And if it sucks, the interpretation is that you now suck.
00:25:35
Speaker
So that's a big one. I look for that. I really do. I also obviously look for descriptive header information so that I know what's going on. I look to see that if I do have questions about what's going on in this code as I'm reading it, that I feel like the person who wrote it has provided a roadmap.
00:25:53
Speaker
Now, you can open any file in MATLAB, any function. Just write open and the function name, and it will open up the .m file that is that code. It's not like these are hidden things. And you can see from those descriptions who wrote it, where it's sourced from,
00:26:11
Speaker
and what goes in, what comes out. When you type in help in MATLAB and then the name of a function, that header is what comes up. So clearly this is an important thing. It's kind of like why we use MATLAB, why people have loved MATLAB if they do, is because the help file is so well documented. Well, the help is just a bunch of code headers that people have written, and they're comments.
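Python offers a comparable mechanism: the built-in `help()` and the standard library's `inspect.getdoc()` both return the docstring written at the top of a function. The sketch below assumes a made-up function `rms` and a placeholder attribution line; only `inspect.getdoc` is a real library call.

```python
import inspect


def rms(values):
    """Return the root-mean-square of a non-empty sequence of numbers.

    Written by: <author name goes here>
    Input:  values, a non-empty sequence of int or float.
    Output: float, the square root of the mean of the squared values.
    """
    return (sum(v * v for v in values) / len(values)) ** 0.5


# The header a reader sees is exactly the text the author wrote.
header = inspect.getdoc(rms)
print(header.splitlines()[0])
```

If the docstring is missing, `inspect.getdoc` returns `None` and `help()` has nothing beyond the signature to show, which is the situation described above as a reason not to use the code.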
00:26:31
Speaker
If that isn't there, I'm not using it. That's crazy. I probably couldn't even figure it out myself. But then the other thing is, you will open those files and see that they're somewhat sparsely commented in line.
00:26:42
Speaker
And that's a bit of a red flag, but sometimes you don't have a choice there. I prefer to use codes that are well commented in line as well, but as long as the header is there, I'm good. As long as it's attributed, I'm good.
00:26:54
Speaker
Also, if there is changelog information, I feel much better. Now, that changelog information might not actually be in the file itself, especially if you're using product data management software, or if you're using Git, for instance.
00:27:08
Speaker
A lot of those changelogs can be in the commit statements for a Git repo, or they could be in whatever commenting system or note system exists in this PDM software.
00:27:20
Speaker
But regardless, I want to know that if it has been touched by many people, it has been documented how it's been touched by many people, not just a Wild West approach where the last person who screamed loudest gets to publish the code.
00:27:32
Speaker
These are all very important things to me. And if those don't exist, I'm either scavenging that code and rewriting it myself and verifying it, or I'm ignoring the method entirely because I don't have time to find how this thing works.
00:27:45
Speaker
Right. And it's just too risky to import something you don't know. There's just not enough benefit. Like, I'm not going to get fired for overwriting something important. That ain't me.
00:27:56
Speaker
Yeah, and a real persistence in engineering communication, just how important it is to establish your ethos and your trustworthiness in the genre that is being used at that moment. Right. Trustworthiness really shines here as an important word when it comes to code commenting.
00:28:14
Speaker
That word in particular, I think, lifts a lot when it comes to the value of comments. The value of comments has a strong tie-in to actionability and in many ways a tie-in to intuitiveness, which are all very important things in engineering communication.
00:28:31
Speaker
But trustworthiness is really what I think lies at the heart of the value of writing good comments. You want to be seen as a trustworthy creator of product. And without code comments, it is very hard to come across as trustworthy because there's nothing to trust.
00:28:49
Speaker
And just listening to some of your responses to the scenario I've kind of cooked up, it sounds as if, when your comments aren't trustworthy and aren't usable, then your code's not usable.
00:29:02
Speaker
Yeah, I would say that's pretty self-evident in the fact that I wouldn't use it, right? Like, I wouldn't do it. If I'm unwilling to take this leap, and any person who is looking at code is unwilling to take this leap, I mean, let's put it back into the mind of a student.
00:29:19
Speaker
If a student is looking up on Math Central or Stack Exchange or some sort of forum, God forbid Reddit, and somebody has written some code and they're trying to evaluate whether or not they can use this code in their analysis, they're also not going to trust it if they don't see documentation. Because quite frankly, a novice coder isn't going to be able to understand what's happening inside of these larger codes. And their debugging skills aren't honed enough to be able to determine its trustworthiness on their own.
00:29:49
Speaker
There's a whole way to take a piece of code that has no comments and determine what is coming out of it and its usefulness, but that takes a lot of skill in the realm of debugging code and test cases, and an entire framework of thought around validation and determining appropriateness. This is not a novice concept. So a novice coder without the commenting won't be able to do that back end. And quite frankly, I think most students would be hard pressed to take that piece of code and trust it to try and earn an A in a class project, let alone do something in an internship, let alone stake their professional reputation on it.
00:30:27
Speaker
Yeah. I've heard this axiom that no comment is better than a bad comment. Is that true? It depends on how bad the comment is, I would say. I think that there are certainly degrees to that. The whole concept of update counter by one, that's a harmless comment, right? It might be slightly annoying or even distracting. Right. Right, but certainly if the comment says update counter by two and what you're really doing is updating the counter by one, yes, in that case you have done a very bad thing with that comment. You have attempted to say a thing which is now not true. And by the way, it might not be a malicious act.
00:31:06
Speaker
You may have written the comment when the code was one way and now you're moving quickly. Now you're moving fast. You are trying to fix this code for something else. You're like, well, I'm going to fix it all right now. And then we go back through and I'm going to comment it.
00:31:17
Speaker
So you leave the comments in place for this current iteration of code. And now you realize that you should have been indexing by two because you want to, I don't know, hit every even number instead of every odd number or whatever, or every number. And so you perform this small change and now your code works and you're excited. You save that and then you go back to your overall script that you're using to do this analysis, and now this script is giving you data and you write that data up and you're very excited, and you have just committed a code sin,
00:31:43
Speaker
which is you didn't comment your changes as you were making them. And you were so excited about the fact that your code now runs that you left this incorrect comment in place.
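The code sin described here can be sketched in a few lines of Python. This is only an illustration: the function names, the data, and the changelog wording are hypothetical and not taken from the episode. The first function carries a comment that was true before a quick fix and is false after it; the second shows the same logic with the comment updated alongside the change.

```python
def sum_every_other_stale(samples):
    """Sum a subset of the samples (header never updated after the fix)."""
    total = 0
    i = 0
    while i < len(samples):
        total += samples[i]
        # update counter by one   <-- stale: the line below now steps by two
        i += 2
    return total


def sum_every_other(samples):
    """Sum every second sample, starting at index 0.

    Changelog (hypothetical entry):
      - step changed from 1 to 2 so only even-indexed samples are used.
    """
    total = 0
    i = 0
    while i < len(samples):
        total += samples[i]
        i += 2  # step by two: visit indices 0, 2, 4, ...
    return total


data = [10, 1, 20, 2, 30]
# Both versions compute the same number; only one of them tells the truth.
print(sum_every_other_stale(data), sum_every_other(data))
```

The two functions behave identically, which is exactly why the stale comment is dangerous: nothing fails, and the next reader has to choose between trusting the comment and re-deriving the loop.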
00:31:53
Speaker
And now you're a mature grad student, you're a senior grad student, you've written your dissertation, you have this lovely data, all these amazing plots, good for you. And you hand this code off to the next iteration of you, the young neophyte researcher, right? The first-year PhD student.
00:32:11
Speaker
And the first year PhD student, their first act is to try and read through your code and figure out what the hell you did. Because they can read your thesis, but that's in a totally different language to them. They know MATLAB at least. They know Python at least. So they go into your code.
00:32:23
Speaker
And as they're going through these things, they are not replicating your results. Why are these results not replicable? What are they gonna do? They're gonna go into each of the functions. They're gonna debug your code.
00:32:34
Speaker
And in your code, it says index by two. And they trust that. And many times they might trust that instead of actually going through and figuring out what that bit of for-loop logic is doing.
00:32:48
Speaker
Now, again, this is a pretty obvious example, but there are many cases in which the actual counter update might be buried in other pieces of code, or in some languages, it's just a very archaic operation.
00:33:00
Speaker
In C, for instance, if you know how to read C, you'll see it, but just how you would index, the i++ command or whatever in C, I mean, if you don't really know what you're reading there, you have no shot at understanding what that means.
00:33:13
Speaker
And then you comment it incorrectly. That person could spend a day trying to track that down. And anybody who has interacted with or debugged code knows how frustrating it is when it's a simple tiny thing, a hanging parenthesis, a simple semicolon, a white space indexing problem that causes you to not get what you want. When it's not the actual model or the math or the physical stuff that you're trying to do and it's something stupid, you think of it as a stupid mistake and you missed it.
00:33:46
Speaker
You spent a day on it because of some stupid comment that you wrote or someone else wrote. Now you're mad at that person. You carry that anger in your heart for the rest of your life. The rest of your life? Yes. I have grudges. I have professional grudges against people who wrote codes that I

Balancing Clarity and Efficiency

00:34:01
Speaker
inherited. I mean, I can see that if I was going to spend all day long scouring through code to find one little mistake, it would stick with me. It's already difficult, and then when it's frustrating on top of that, that's no bueno.
00:34:14
Speaker
So speaking of Reddit, I checked out some subreddits on code commenting to prepare for our discussion today. Oh, boy. And as you might expect, there's a lot of students talking about code commenting there. And there was a meme that I saw, which I assume you've seen as well. And they said, this is what code commenting is like. And it was a picture of a cat with a sticky note that said cat on its forehead. Huh.
00:34:40
Speaker
Yeah. So. Yeah. I think that you know if you're the student and you're really interested in coding and code commenting takes a long time, it can be annoying to do it. But as we've been talking about, it's really important.
00:34:58
Speaker
On the other hand, is it ever redundant?
00:35:02
Speaker
It could be. And the stakes are high enough to where if it is redundant, that's okay. Okay. Okay. It's one of those better safe than sorry type of things. Yeah. I think that redundant comments can be solved by not being redundant. It sounds a little bit like a truism, and it is, the way I phrased it, but there are ways to comment that something is a cat and not suffer the ignominious memeing.
00:35:31
Speaker
I think that goes back to understanding what the future user of the code is going to want to do and the situation. It's about informing decision makers. Is this person going to want to run code that's going to get you to the...
00:35:47
Speaker
smallest decimal point, or are they going to want to run code that is going to run the fastest? Sure. All of this is stuff that comes in section-header-level and function-header-level code commenting, and it's really about intent, right? In many ways, your comments signal intent. And that is contextual information that another coder might not have, and it is very necessary when it comes to interacting with your code.
00:36:19
Speaker
So if what you're trying to do is, again, just describe the serial operations in lines of code, yeah, you're going to get a cat with a cat sticky note. But if what you're trying to say is this code identifies which type of cat, and whether it's male or female,
00:36:37
Speaker
and whether this cat likes belly rubs, you know, that's important information. And it's also a lovely cat, if the answer is yes. It's pretty rare, too. Pretty rare cat, if the answer is yes. But I have owned that cat, and that cat is lovely. The reality of what I'm trying to say is that there are things that are important to comment, not just how the code operates, but what the code was supposed to do.
00:37:00
Speaker
And if you do capture that, and you leave out the line by line comments, you are actually still in a perfectly viable and worthwhile place when it comes to communicating what your code does.
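As a rough sketch of that difference, the two Python functions below do the same thing. The signal-processing task, the names, and the stated intent are all hypothetical choices made for illustration. The first is commented line by line in the "cat equals cat" style; the second has no inline comments at all but states what the code is supposed to do.

```python
def crossings_cat_equals_cat(signal, threshold):
    count = 0  # set count to zero
    for previous, current in zip(signal, signal[1:]):  # loop over pairs
        if previous < threshold <= current:  # check the condition
            count += 1  # add one to count
    return count  # return count


def count_rising_crossings(signal, threshold):
    """Count how many times `signal` rises through `threshold`.

    Intent (hypothetical use case): estimate how many separate events
    occurred in a record, where one event is one upward crossing.
    Falling crossings and samples that stay above the threshold are
    deliberately ignored so a long event is not counted twice.

    Inputs:  signal    -- sequence of numbers in time order
             threshold -- level that defines the start of an event
    Output:  non-negative integer number of upward crossings
    """
    count = 0
    for previous, current in zip(signal, signal[1:]):
        if previous < threshold <= current:
            count += 1
    return count


record = [0, 2, 0, 3, 3, 0, 1]
print(count_rising_crossings(record, 1))
```

A reader of the second version can decide whether the function fits their problem without tracing the loop, which is the contextual information the line-by-line version never supplies.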
00:37:13
Speaker
And so this is maybe the whole, if you're talking about no comments are better than bad comments, again, as long as the intent of the code comes through pretty clearly,
00:37:23
Speaker
and there is a one-to-one relationship between what you say it does and what the code actually does. And you can communicate exactly what operations are being performed at the more meta level.
00:37:35
Speaker
Like, this data is turned into frequency domain data via the Fourier transform. You can set these properties of the Fourier transform and the output should be an n by one vector.
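A function header along those lines might look like the sketch below. It is an assumption-laden illustration, not the guest's code: the function name and the `normalize` option are hypothetical, and a naive discrete Fourier transform from the standard library is swapped in for a real FFT routine so the example stays self-contained.

```python
import cmath


def to_frequency_domain(samples, normalize=False):
    """Turn time-domain samples into frequency-domain data.

    Method:  naive discrete Fourier transform, O(n^2). This is NOT an
             FFT; it is used here only to keep the sketch dependency-free.
    Inputs:  samples   -- sequence of n real or complex numbers
             normalize -- if True, divide every output bin by n
    Output:  list of n complex numbers (an n-by-1 vector). Bin k holds
             the component with k cycles over the length of the record.
    """
    n = len(samples)
    spectrum = []
    for k in range(n):
        total = 0j
        for t, x in enumerate(samples):
            total += x * cmath.exp(-2j * cmath.pi * k * t / n)
        spectrum.append(total / n if normalize else total)
    return spectrum


spectrum = to_frequency_domain([1, 1, 1, 1])
print(len(spectrum), abs(spectrum[0]))
```

The header says what goes in, which property can be set, and what shape comes out, so a reader can check the body against those three claims one at a time.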
00:37:46
Speaker
This is all really good stuff. But if you go in there and say counter updated by one, yeah, you could leave that out. That is cat equals cat. You teach our numerical methods course here in the Woodruff School. I do.
00:38:00
Speaker
What are some things that you're really wanting to see from students when they do code commenting in your class? What I want to see from them is that they have any concept of what they're doing.
00:38:11
Speaker
Okay. And really, it is very easy to find bits of pre-written code.

Role of AI and Human Oversight

00:38:19
Speaker
In fact, it is almost ubiquitous now that most of the code you're going to find on the internet is probably generated by some sort of large language model.
00:38:31
Speaker
If you are writing a bunch of code that you're getting help with from the internet, or that maybe you are getting from ChatGPT or Microsoft Copilot or Gemini or whatever it is, I want to see some comments that show me that you know what this code is supposed to be doing, not necessarily just the comments that these models are creating. Now, honestly...
00:38:50
Speaker
I can't throw too much shade because those AI models actually write some pretty decent comments. They are very, very, very susceptible to cat equals cat, but they also sometimes do a very good job of creating a rough header for you that you can then add more context to.
00:39:08
Speaker
And if there is a specific problem-solving context, like it does a good job of writing the code for this numerical integrator, but you're specifically writing this with some salient features for, I don't know, processing accelerometer data,
00:39:21
Speaker
then you can go in and add the specifics for the accelerometer data. And now you have actually a really nice framework. And I think that maybe what that can be summarized as is I want to know that a student has interacted with this code in some way.
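To make that workflow concrete, here is a minimal sketch of a generic integrator whose header has been extended with problem-specific context. Everything specific in it is an assumption made for illustration: the function name, the units, the at-rest starting condition, and the note about sensor bias are not from the episode.

```python
def integrate_trapezoid(values, dt):
    """Cumulative trapezoidal integration of evenly spaced samples.

    (The line above is the kind of rough header a model might draft.)

    Added problem context (hypothetical):
      - values are accelerometer readings in m/s^2, one every dt seconds
      - the result is velocity in m/s, assuming the sensor starts at rest
      - sensor bias is NOT removed here, so drift grows with record length

    Returns a list the same length as `values`; the first entry is 0.0.
    """
    if not values:
        return []
    velocity = [0.0]
    for previous, current in zip(values, values[1:]):
        # Area of one trapezoid: average of the two readings times dt.
        velocity.append(velocity[-1] + 0.5 * (previous + current) * dt)
    return velocity


# Constant 2 m/s^2 for one second, sampled every 0.5 s.
print(integrate_trapezoid([2.0, 2.0, 2.0], 0.5))
```

The integration loop is the part a model can produce easily; the added lines about units, initial conditions, and drift are the part that shows someone has interacted with the code.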
00:39:38
Speaker
It's really not that important anymore to write the code yourself. And I'll probably get crucified for this. But I really think that the future of coding will be natural language instructions to language models.
00:39:52
Speaker
Whether those are large hyperscale language models or not is completely different. But this is a useful way to write code. It is 90% correct. I think, though, and what you talked about here, is that in this move, the necessity for code comments to be highly trustworthy is not going anywhere. No, not at all. I mean, if anything, you have to police the code comments more than the code itself.
00:40:20
Speaker
Because oftentimes these language models, these AI models, are going to write good code, but it could come out really weird. It does what you want to do, and you may have the whole cargo cult thing coming back. There could be bugs in there.
00:40:31
Speaker
But as long as you can describe what the inputs and outputs are and the use case for this thing, you can kind of take a lot of that and at least make it usable. Maybe a little dangerous. You do want to battle test it a little bit still.
00:40:45
Speaker
But yeah, if you look at the comments that these models generate and they're not terribly useful, you can still make this ChatGPT-generated code useful by making the comments useful.
00:40:57
Speaker
I don't know if I said that 100% the way I wanted to say it, but I think that the message is still pretty clear that the act of creating with code now is simpler than ever.
00:41:09
Speaker
In fact, if you think about creation as the ultimate expression of engagement with material, that's not really the level that requires the ultimate engagement anymore. It's not just you go from like knowing to, I don't know, understanding to applying to analyzing to evaluating to creating, right? The classic Bloom's taxonomy structure.
00:41:28
Speaker
You can come in at the creating level super easily just by natural language prompting these large language models. So if creating is minimal effort, where does the engineer make their money?
00:41:41
Speaker
They make it at the analysis level. Right. How do we make this profession something that can't be coded away? Yeah, really. And that's where the value is going to be from now until infinity. Anything that can be done for you by machines needs to be interpreted and made trustworthy by humans.
00:41:58
Speaker
And that is essentially what the comments are doing, is they are interpreting this code and making it trustworthy. After your class, students will then probably do coding again with code comments in Capstone.
00:42:12
Speaker
What advice would you give them at the end of your class for keeping those skills fresh and doing well when moving from an individual assignment like you do in numerical methods to a team assignment like we do in Capstone?
00:42:25
Speaker
The commenting behaviors that you practice when no one is looking are the same commenting behaviors you will practice when everyone is looking.

Impact of Consistent Practices

00:42:34
Speaker
I mean, you can apply this to pretty much any skill-based domain.
00:42:41
Speaker
And it's doubly true in commenting because, again, it feels so silly to write something that will never be executed by a computer in the domain of computer programming.
00:42:53
Speaker
But it's not there for the computers. It's there for your group mates. It's there for you a week after you wrote it. Or for your sponsor once the project ends. If they're going to use your code and send it into prod, like, you know they're sending it into production, then the person who is in charge of taking your student code will take this code and will have to interpret it. And they're going to do exactly what we talked about earlier, which is take this unknown code from some people who you have to assume are either malicious or yahoos,
00:43:22
Speaker
because that's the only way to safely keep your job. Exactly. Yeah, and so they're going to give it the same fine-tooth comb analysis that you would have done in their shoes. And they're going to use your comments as the basis for verifying your code. They're going to check your code against what you've written in the comments, and if it all seems...
00:43:43
Speaker
to be above board, then they will start to trust those comments a little bit more and maybe trust your code as well. So there's that. But the skill that I want my students to understand they need to keep forward is that it is a practicable thing. This is a muscle to be worked out.
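One way to make that check mechanical, offered here as a sketch and not as anything the speakers prescribe, is to put runnable examples in the function header. Python's standard `doctest` module re-runs those examples and reports any that no longer match the code. The `rms` function and its example values are hypothetical.

```python
import doctest
import math


def rms(values):
    """Return the root-mean-square of a non-empty sequence of numbers.

    The examples below are executable claims about this function:

    >>> rms([1, 7])
    5.0
    >>> rms([2, 2, 2])
    2.0
    """
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    # Re-run every docstring example in this file and report mismatches.
    doctest.testmod()
```

If someone later changes the body without updating the header, running the file flags the disagreement, so the comment cannot silently drift away from the code.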
00:43:59
Speaker
And there is unfortunately the whole saying of practice makes perfect. That's completely wrong. Perfect practice makes perfect, right? What practice does is make permanent. So if you want to make very good commenting part of your practice, then you have to practice making very good comments over and over and over again. And that means in the face of the overwhelming demands on your time, this thing which, again, seems tertiary in importance, which is actually the most important thing when it comes to the use of this code outside of your own little world, your own little fishbowl, you have to keep on top of that because...
00:44:36
Speaker
We talked about what happens if you forget to comment and change comments in the moment and you get sloppy with your commenting. All of a sudden you have these inconsistencies. And now if I am looking at that code and evaluating its trustworthiness, I delete it and I write it myself.
00:44:50
Speaker
And that's it. So those are the stakes. The other thing is you don't want to be that person who drives your group mates insane. They should be able to look at your code and know what it does too. At a minimum, please, for the love of God, write good function headers.
00:45:03
Speaker
If you're not going to comment in line, write good function headers. So at the basic lowest bar, put your name on it. Put your name on it. So that people can ask you questions. Yeah, so that you can serve as your own interpreter. But this will be a fun moment.
00:45:19
Speaker
You'll put your name on a piece of code. Two years later, someone's going to ask you what you wrote. You have no idea. And why? Because you didn't comment it. Happened to me. It's happened to me several times. Oh yeah? Oh yeah.
00:45:32
Speaker
And there's egg all over my face. What do you do? Do you go read through it line by line and try to figure out what your thought process was? Or do you start over? I hope that I have some sort of weird shape memory in my brain where, upon reading it, I'm like, oh, I remember what this was.
00:45:48
Speaker
I hope that I have that moment. And if I don't, then yeah, either I feel personally responsible for going back through and doing a post-mortem on this crap that I wrote before, or I would just chuck it out and say, here's what I intended to do.
00:46:02
Speaker
And I would recommend taking a Fourier transform and doing these operations to it. Yeah. It's got to hurt. It does. It does hurt. It hurts my pride more than anything. And it also hurts their timeline. Yeah.
00:46:14
Speaker
Well, hopefully students can learn from our pain rather than have to experience their own. Isn't that what we want for all of our students slash children? Yes. Any other advice that you have for students, engineers in general, about code commenting or about understanding how communication is wrapped around this, what can seem as very isolating and just a

Communication in Engineering

00:46:40
Speaker
numerical process? Yeah.
00:46:42
Speaker
I just want you to remember that there's nothing in engineering that doesn't get touched by communication. There's no such thing as a right answer that is discovered by code. There's no such thing as a correct way to do an analysis without the act of justifying it, both on the front end in terms of assumptions going in and on the back end in terms of its appropriateness.
00:47:10
Speaker
You have to do these things. This is not a domain where there is just a thing that is right and a thing that is wrong at any useful problem scale. This may be true about math problems, but engineering isn't just math problems. Engineering is solving people's problems using math and models and physics and all this stuff, and it all is very complicated. It's all very hand-wavy, right? These models are not wrong. This is the whole "all models are wrong, but some are useful" kind of approach, and that's very true.
00:47:38
Speaker
And all of it comes together into this big soup, which we call an analysis code. But the analysis code's usefulness only extends as far as your ability to convince somebody that it is useful and trustworthy and in many ways intuitive.
00:47:55
Speaker
And if you don't do that through your code commenting, it's not going to happen. Running a piece of code in MATLAB and seeing a plot pop out of it is not self-standing. It's not a stable way of analyzing the trustworthiness of this code or this result.
00:48:12
Speaker
It's got to be explained by a human being. It's got to be analyzed and defended and evaluated, and it has to withstand the crucible. Contextualized. Yeah, contextualized. It has to be interrogated.
00:48:25
Speaker
And all of that stuff is the stuff that is handled by comments. It's not the code itself. So this is a survival trait. It's also your job security.
00:48:36
Speaker
Right? So take it seriously. LLMs can write code, but they're not very good at contextualizing why this code is ideal for you right now. It'll just lie to you about that. I mean, at some point, LLMs have to lie. We could have a whole conversation about what LLMs are really doing, but at some point, they will hallucinate.
00:48:58
Speaker
And if they hallucinate at the wrong time about the wrong thing, this is an extreme problem, and that's where you step in. That's where the engineer makes their money in the modern problem-solving landscape: calling bullshit on LLMs.
00:49:12
Speaker
Yeah. Math is math. Physics is physics. Engineering is human. It's about human problems. Right. And if there's no right answer, then it's all about the interpretation of your answer.

Final Thoughts and Encouragement

00:49:26
Speaker
And that's communication. And that's what code commenting is all about, is interpretation of your code. So there are a million other little kind of axiomatic pieces of wisdom about how to structure things, how to name variables, all these really...
00:49:42
Speaker
interesting kind of mini practices within the practice of commenting. But all of that is just splitting hairs. And you can find style guides on that kind of stuff. You can find threads on Reddit.
00:49:53
Speaker
Or even once you are employed, your employer might have a particular way that they like it done. Yeah, it might differ between bosses. There might be a style guide somewhere. But for the most part, you, first of all, need to find all that stuff out the hard way, which is making mistakes and having somebody correct you.
00:50:09
Speaker
But you have to do it. Because if you don't do it, again, your boss will be mad and you'll be fired. And also, no one will trust any of your work. And your code will die or be left on the shelf. Yeah. Again, it's sad to think that you spent time building this thing.
00:50:26
Speaker
And just like an artist would build something. It's like making a cake that's never eaten. Yeah. That sucks, you know. And it doesn't feel good for anybody. People who don't get to eat the cake and the baker.
00:50:39
Speaker
Well, thank you so much for being with us today, David. Yeah, thank you, Jill. I really enjoyed this, and I hope that everybody writes comments for their code. That's it. That's all I want. Please.