7| Anna Lewis — Genomics, polygenic risk scores, genetic ancestry, race & ethics

MULTIVERSES

Genomics is leading a revolution in our understanding of disease. But the ways we pursue genomics research and the use we make of that knowledge demand careful thinking.

Anna is a researcher at The Edmond & Lily Safra Center for Ethics at Harvard. She holds a PhD in Systems Biology from Oxford (where we met) and has worked in medtech startups. As someone who has looked at genomics from multiple perspectives, she’s an excellent guide to this rocky terrain.

Anna emphasizes the challenges and importance of polygenic traits and Polygenic Risk Scores (PRS). While they are key tools in understanding and predicting traits, they are subject to misinterpretation and misuse if not properly defined. The concepts of 'race' and, more recently, 'continental ancestry group', often used in the calculation of PRSs, can lead to misguided or even harmful assumptions, potentially propagating racist ideologies. Instead, Anna suggests the use of Ancestral Recombination Graphs (ARG) to better represent an individual's genetic ancestry.

Through ARG, we can achieve a more scientifically accurate and ethically sound basis for research. As we continue to make leaps in genomics and potentially influence traits like intelligence or strength, the importance of ethical, legal, and social implications becomes increasingly crucial. As we learn to wield our scientific tools, we need to understand how we should use them.

Transcript

Introduction to Multiverses

00:00:00
Speaker
Hello, you're listening to Multiverses. And no, this is not about the Marvel comics. It is about marveling at the wonders of the universe and human progress.

The Human Genome Project's Impact

00:00:11
Speaker
Two decades ago, in 2003, the Human Genome Project announced that they had sequenced the human genome. The clue's in the name. This meant they'd written down the set of base pairs, billions of base pairs, that constitute the human genetic code.
00:00:27
Speaker
Since then, the price of doing this has fallen from about $3 billion at that time to just a few hundred dollars. It's fallen by a staggering 10 million fold. Despite this enormous technological progress, there remain huge holes in our knowledge and our understanding of how the genome works.
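As a quick check on the arithmetic in this segment, here is a minimal sketch. The $3 billion figure is the one quoted; the value of 300 for "a few hundred dollars" is an assumption chosen for illustration.

```python
import math

# Figures as quoted in the episode; 300 is an assumed stand-in
# for "a few hundred dollars".
hgp_cost_usd = 3e9
current_cost_usd = 300

# How many times cheaper sequencing has become.
fold_reduction = hgp_cost_usd / current_cost_usd

# The same ratio expressed as orders of magnitude (powers of ten).
orders_of_magnitude = math.log10(fold_reduction)

print(f"{fold_reduction:,.0f}-fold reduction")
print(f"about {orders_of_magnitude:.1f} orders of magnitude")
```

With these inputs the ratio comes out at ten million, which matches the "10 million fold" stated above. A different assumed current price would shift the result proportionally.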

Challenges in Genomic Understanding

00:00:49
Speaker
It's as if we've transcribed the Book of Life, but we don't know how to read it.
00:00:54
Speaker
Our guest this week is Anna Lewis. Anna is a researcher at the Edmond J. Safra Center for Ethics at Harvard. She's going to talk a little bit about why it's been hard to make progress in understanding how genes translate into traits, and also her particular area of interest, which is about understanding what we should do with that knowledge when we have it, how should we use it, and also how should we go about acquiring that knowledge. So these are the ethical questions.
00:01:23
Speaker
In particular, look out for her thoughts on genetic ancestry. This is a notion that's really crucial for developing accurate polygenic risk scores, which are something that Anna will explain, but they're essentially useful medical tools which explain to individuals how likely they are to develop particular traits. And we need something like genetic ancestry to make those scores as accurate as possible.

Genomics and Societal Misconceptions

00:01:47
Speaker
But if we misconstrue genetic ancestry, if we
00:01:50
Speaker
think of it as simple continental groupings of people, then we risk making race seem like something that's scientifically meaningful when it's not. And that will do a disservice to the science. It won't help, because it's simply not salient. And it will also do a great disservice and harm indeed to society. And Anna will introduce us to another way of understanding genetic ancestry, the ancestral recombination graph, which can get this right for both
00:02:20
Speaker
scientific research and in ethical terms. Anna is an old friend of mine. In fact, we started studying together the same year that the genome was first sequenced. So I rather selfishly take this as an opportunity to ask her lots of questions about her career, how she went from PhD in systems biology to working in the genomics industry, and now to working in the ethics of genomics. I hope you enjoy this conversation as much as I did. I'm James Robinson. This is Multiverses.
00:03:01
Speaker
Anna Lewis, thank you for joining me on Multiverses. Oh, it's a great pleasure to be here. This is very exciting because I'm getting to use, for the first time, the bidirectional setting on my microphone because you're actually sitting just across from me. Yeah. So yeah, thanks for being here in person as well. Oh, thanks for having me.
00:03:19
Speaker
So we studied together, in fact, I think we met about 20 years ago, and then four years after that, after studying physics and philosophy, I remember asking you, oh, what is it that you're gonna do next? And you said, oh, I'm doing a PhD in systems biology. And my question then, and it's a question I'm gonna ask again is, well, what is systems biology?
00:03:47
Speaker
Yeah, well, it's interesting because I feel it was a buzz term at the time and it's like the cynical view might be it was whatever helped you to get grants at the time.

Systems Biology vs. Reductionism

00:03:59
Speaker
But I got into that, and this is a roundabout way of answering your question, through a seminar series that I attended whilst we were undergrads on the conceptual foundations of systems biology, which was run by this guy called Denis Noble, who is
00:04:17
Speaker
um, well into his eighties by now. And he was one of the first people to mathematically model the heart back on those huge computers that took up a whole room. And he's sort of seen as one of the founders of systems biology. And he ran this seminar series cause he was trying to get at this question of what, what is it that we're doing and what is it that characterizes the type of science that we're doing? Um, so it's certainly a science which, um,
00:04:44
Speaker
tries to iterate between theory and experiment and that tries to model biological systems at multiple levels. So to understand how cellular function emerges from interactions between proteins, for example.
00:05:03
Speaker
or how systems level function emerges from kind of the details, potentially the mechanical and electrical details of what the tissue is up to in the heart in his case. So it's trying to sort of build up this holistic understanding of what's going on. And it's directly counter to what Denis and various others saw as the dominant
00:05:32
Speaker
philosophy in biology of reductionism, which was working out what the biological parts were. So that's, you know, human genome project and then identifying what the genes were. And you get this long parts list, which it turns out doesn't really advance your understanding that much. You need to get to, well, how did those pieces interact together?
00:06:01
Speaker
sort of add up to more than the sum of the parts and that's really I think what systems biology was about. And then you get some interesting conceptual issues when for example you're using the language of causation and like some people start talking about downward causation and that type of thing and it all got very murky and suspicious but
00:06:24
Speaker
Yeah, that's sort of the best answer I can give, but I do think that the place to actually go would be to see who is still offering grant money and how are they defining those pots of money.
00:06:38
Speaker
Yeah, so I guess, yeah, it's striking, you mentioned the word emergence and it is fascinating how some of the components in biology seem really simple and pretty well understood. But then when you start to put them together, you find that you get incredible complexity of behavior coming out. Was this something that attracted you? I remember you talking about complexity at the time as well.
00:07:06
Speaker
Yeah, for sure. I think, yeah, one of my favorite kind of anecdotes is that when they were doing the human genome project, they had a sweepstake for the number of genes they thought they would find in the human genome.
00:07:20
Speaker
And nobody guessed as few as were actually found, right? So that's roughly somewhere between 20,000 and 25,000, depending on how you count; a gene is actually not a very well-defined term. But everybody had guessed way north of that number. And then, yeah, so somehow you've got a smaller number of parts than you thought, but it's both the case that the parts themselves are more complicated and more interesting than we thought they were. And then there's just so many
00:07:50
Speaker
ways that they interact to bring about function. Yeah.

Complexity and Evolution in Biology

00:07:56
Speaker
So it's a very, I remember, I remember that the moment I sort of fell out of love with theoretical physics was sitting in our second year quantum mechanics lectures and we just modeled the hydrogen atom and that was very beautiful. We got onto the helium atom, which isn't a very exciting system.
00:08:16
Speaker
Um, and you were already having to make what seems like kind of ugly approximations and assumptions to really model it using the tools that you had available. So if that was what was happening with the helium atom, then what hope for like more genuinely interesting systems. Um, yeah. And we still don't have, there's so many choices about how you model these systems and it depends what you want, what kind of understanding you want to get out of them. But that seemed to me more interesting.
00:08:45
Speaker
And yeah, I think on the other hand, like systems biology, I think maybe draws some inspiration from physics in that it is about modeling in a way that maybe just kind of traditional biology is more about, I don't know, observation and doesn't really have much room for the use of mathematical tools. I think that's right. Yeah. I mean, biology has this, has this superpower of, of evolution and, um, whose quip was it that like,
00:09:16
Speaker
in the absence of evolution and the kind of explanatory framework that
00:09:20
Speaker
that's giving you, everything else in biology is stamp collecting. I think it was Rutherford maybe. I think he said like, there's, I think he said that everything outside of physics was stamp collecting. Oh, I see, okay. He certainly caught biology in that, but yeah, I think it was physics and stamp collecting. But certainly, I mean, you can see how that fits with Darwin. He was like a stamp collector, but his stamps were like little fossils and things, right? Right. But he also was really trying to look for the patterns behind that. Yeah.
00:09:49
Speaker
Yeah. He categorized those stamps in a really explanatory way. And it's just phenomenal how explanatory it is. Um, yeah, nothing in biology or life makes sense except in the light of evolution. Um, it's a very beautiful, very, very beautiful framework. Um, yeah, but it's, but, but I, I, I agree with you that it is, it is a science that's really trying to,
00:10:18
Speaker
to model stuff and see where that gets you. Yeah, yeah. Just completely off topic, but there's another of Rutherford's errors that I came across recently. Yeah.
00:10:31
Speaker
It's tangential, but it was just the day before Leo Szilard came up with the way that you could create a nuclear chain reaction and harness nuclear energy, and he was in fact inspired by something that Rutherford said. Rutherford said, oh, we'll never be able to do this, this is just a fool's errand. We'll never be able to get all these amazing quantities of energy that are stored in mass, and
00:10:55
Speaker
I've read that apparently. And you're just like, oh, maybe you can. Within like 24 hours. Challenge accepted. Exactly. I think it's good sometimes to have these throw down statements and like, prove me wrong. So yeah, prove that it's not just stamp collecting. And I think systems biology is certainly one of the ways that that challenge is being taken up in biology. Yeah.
00:11:23
Speaker
I think that's right. And these days people like to throw AI tools, or at least what they like to call their AI tools, at it, which often is less, less explanatory. And yeah, one of the things I like about systems biology is, you know, it makes it very clear what your models are, which is something that I think we sometimes lose. Um, yeah. So you should always be clear when you're doing science, like what are the set of assumptions that you are making in order to derive your results and then
00:11:53
Speaker
need to remember what those were and I think we often forget those. Maybe we could see one kind of doubt that enters my mind about just the prospects for these kind of ways of studying things is just emergence seems fundamentally complex and hard to model and we were talking earlier just offline about how
00:12:17
Speaker
Stephen Wolfram, for instance, is saying, well, everything is just models. But on the other hand, the models sort of acquire this kind of complexity that at some point you lose track of the explanatory power of the model, perhaps. And maybe if we can think through, if you have an example that comes to mind of where this is really hard, where it just seems fundamentally difficult with
00:12:46
Speaker
to model something and actually get explanatory power back because the simplifications of the model break the predictive power or where it actually works. Putting you on the spot here. Yeah. Well, the first thing, this isn't exactly what you're after, but I think the thing with these complex systems is it's very easy to have an oversimplified story that you then think applies to various things that you see. So I remember learning about
00:13:17
Speaker
stripes that form in various animals and you can have a really nice reaction diffusion equation that's happening in the embryo which will generate these stripes for you and then for you know for a long time I can't remember which model system it was in it was like oh yeah this is how we get the stripes and indeed I think there are some systems or some organisms in which that's the case but then you get the same thing the same sort of stripy pattern forming during development for lots of other
00:13:46
Speaker
kind of mechanisms as well. So it's really easy to think, oh yeah, we've explained this phenomenon. And then think that applies to, in this case, all of life, when actually it applies to this small little section of the tree of life. And yeah, you just can't generalize as much as you thought you could. I think that's one of the big dangers.
00:14:15
Speaker
Yeah. So partly there's a kind of lesson in, I don't know, maybe not humility, but being cautious here from, and actually one of the things that systems biology is telling us is this is hard, right? And we have to, we have to appreciate that sometimes it may look that like, you know, the data fits our model, but that doesn't mean the model is correct. Right.
00:14:42
Speaker
Right. And I think something that happens, something that's happened several times in biology is that physicists have kind of really turned to biology. So that happened during the molecular biology, kind of early days thereof. So Schrödinger famously writes this book, What Is Life?, where he's really sort of hypothesizing about what the genetic material might be. This is pre Watson and Crick.
00:15:08
Speaker
And he was part of a sort of wave of physicists that go towards biology. And I think that was happening again, kind of in the, I think it happens continuously, but a lot of physicists were sort of taking their toolkit and hoping that it would be applicable to biology.
00:15:27
Speaker
and maybe doing some oversimplifications along the way. So I do think that a lot of humility is needed. Yeah. I think, yeah, the Schrödinger example is a good one. And we've had a lot of episodes on quantum mechanics. It's really interesting to see. I mean, I was curious about your opinion on his book, What Is Life?
00:15:53
Speaker
It seems like there's some very good insights in there. The one that sticks with me is that you can't explain the kind of structural stability of chemistry that you need to sustain the transmission of characteristics through generations without quantum mechanics.
00:16:18
Speaker
you need, but you also need that stuff to be flexible enough that characteristics can change. And he has this kind of insight that things have to be very kind of carefully balanced before he, you know, as you say, before DNA was discovered. He was like, well, there's got to be some code here. That's pretty stable, but not completely frozen, I guess. Yeah. Yeah. And that's exactly right. And that's like a kind of parameter actually, I guess it's like,
00:16:43
Speaker
You know, it's a more complicated space than the unidimensional space, but a parameter that evolution tunes, right? It's like how, how much mutations you get each generation.
00:16:55
Speaker
But yeah, I think, I mean, it's a very long time since I read the book, but I think he uses the term aperiodic crystal, which is not too far off, right? Like there was definitely something in that, but I think his main contribution was to inspire a bunch of physicists to go into biology. Yeah. Yeah. I guess he was not one of your inspirations though, that was like more...
00:17:26
Speaker
Yeah, I definitely credit Denis Noble to a large extent. And he's got some very accessible books where he sort of saw himself as the counterpoint to Dawkins. So Dawkins, these two old white guys based in Oxford. And Dawkins is very associated with this reductionist view.
00:17:52
Speaker
and Denis was trying to sort of build the counter culture against that. I remember sitting with, I'm sure you were there, but I remember being in a room in Balliol, because both Dawkins and Denis Noble had some kind of associations with Balliol, I think. Denis Noble was at Balliol at the time, Dawkins had been, and they had these
00:18:17
Speaker
very different views. I remember Denis Noble reading an extract from The Selfish Gene where Dawkins says, oh, all the genes are interested in is propagating themselves and doing this and they'll use humans to, and like, you know, emergent structures, I suppose, to kind of fulfill this for them. And then he just
00:18:40
Speaker
you know, went through it line by line and reversed the meaning of every kind of like, I guess, intentional verb. So it was like, you know, oh, well, all the structures, humans are just like exploiting their genes so that they can get the things that they care about done. And, you know, it was, you know, I suppose going to the falsification and
00:19:06
Speaker
or falsifiability of this theory and saying, you know, is there really content in this assertion or scientific content in this assertion? Right. Right. It's modeling. It's modeling all the way down. Like, yeah, his point was, and I'd forgotten about that, but thank you for reminding me. His point was that in that whole paragraph, which is a very sort of central paragraph to The Selfish Gene, there's almost no empirical statements. I think there was like, you know,
00:19:34
Speaker
the genes are inside us or something, which was perhaps the only thing which would qualify. And then all the rest was kind of metaphor. Yeah. And it can be very useful metaphor. And, you know, I think it has prompted the sort of selfish gene model has prompted people to go off and look for new empirical data. But
00:20:00
Speaker
I remember asking Dawkins in some talk he was giving about the extent to which he viewed what he was doing as building up these models and metaphors. And his answer was, you know, not very much. Yeah, but I think it is. It's like, you know, it's a way of modeling what's going on. And the danger is when we think that there's like one true way to look at reality.
00:20:24
Speaker
Yeah. Well,

Modeling Reality in Genetic Science

00:20:25
Speaker
I think it's natural often for us to conflate models with reality. Like if a model is really successful, then
00:20:36
Speaker
yeah like you kind of take the elements of that model and you say that stuff exists but I suppose like the issue is with the kind of genetic model is it's not telling us anything about like it doesn't have desires built into it right that's just like something that's kind of grafted on I mean you can view it as part of Dawkins's model but it's not a fundamental aspect of kind of
00:21:02
Speaker
the science, I suppose. As you say, fundamental is things like the genes are inside us. That is a part of the model. A central dogma of... Right. No, but I would still, maybe I'm using the term more expansively, the term model. It's like a set of ways of thinking about something that would prompt other ways of thinking about the same thing. I would view the whole kind of selfish gene as
00:21:27
Speaker
sort of model or class of models for thinking about biological systems. Yeah. Yeah, that makes sense. Yeah. Yeah.
00:21:36
Speaker
And okay, so systems biology, you did a PhD in that. And I've got to say, whether it was like, you know, whether the money being, you know, following that field was a motivation or not, you certainly picked a time where like just over the past decades, not so much, or not just systems biology, but I guess like the technology around biology,
00:22:06
Speaker
has just like exploded. So yeah, tell us what you did next. Well, I took a minor detour via management consultancy and Australia. It's quite a detour geographically. And then I was actually, I was supposed to be in San Francisco for a weekend, but I actually stayed with
00:22:32
Speaker
a physics and philosophy forebear of ours, Adam Brown, in San Francisco. And I had a really wonderful, wonderful time. I met all these people fired up about life and excited about projects and ideas. Uh, so I phoned up my employer in Australia and told them I wasn't coming back. And then there I was in the States and I needed to get a visa quickly. So I bulked up my coding abilities and I got a job.
00:22:59
Speaker
in a company which was building software to interpret genomes. So it was the case then, so this is 10 years ago now, nearly 10 years ago, and it's even more the case now that people cite the cost of sequencing genetic material. That cost has decreased faster than Moore's Law.
00:23:25
Speaker
But what does that get you? That gets you a whole load of, there's this very, very sort of turnkey software that will then get you a whole load of As, Cs, Gs and Ts, basically reads, we call them, which you can then compare to what we call the reference genome. And then you look for differences and
00:23:53
Speaker
If you're looking at a whole genome, so across the 3 billion base pairs of our genomes, you'll get out, depending on how you count, something like three or four million points at which any given individual will vary from that reference, and we call those genetic variants.
00:24:15
Speaker
So to get from the stage where you've gone from a sample of blood or saliva to that list of variants, that price has come down hugely. But that list of variants doesn't actually tell you anything. You need to then interpret them. And so we were building software to do that stage. And there's various different moving parts to it. And there's various different use cases for why you might want to do that.
00:24:45
Speaker
But the central one, you know, was then and continues to be today where you have basically a person, usually a child with a suspected genetic condition and you want to find out what's causing it. Right. And

Advancements and Tools in Genomics

00:25:01
Speaker
the, you know, you have this list of 4 million variants and one or at most two would be the underlying causative variants. If it is indeed,
00:25:15
Speaker
um, what we call a Mendelian genetic disease, which is basically something has gone really wrong with exactly one gene. Um, either, either both copies, both the copy inherited from mum and the copy inherited from dad are funky and that's a recessive condition. Um, which, uh, um, a great many of the
00:25:40
Speaker
8,000 or so rare genetic diseases that we know are, or sometimes you need just one copy of the gene to be funky. It's either loss of function or it's gain of function. These are the dominant diseases. And typically, those variants have not been inherited from either mum or dad. They've arrived de novo in
00:26:04
Speaker
more often than not the sperm, sometimes the egg or sometimes early on in embryonic development. So the challenge is to identify those variants and there's all sorts of different information that you can use to try and do that from
00:26:30
Speaker
from sort of evolutionary conservation. You can look at humans compared to other species. If it's a really conserved point and then this individual has something, a mutation there, then that can alert you that maybe that increases the chance that that particular variant might be causative.
00:26:51
Speaker
or if it's a variant which is really, really rare across humans sampled throughout the world, that increases its chances. Often we might have other types of evidence or functional evidence. Somebody has actually done some sort of an experiment and there's various other classes of evidence. And yeah, you need to sort of aggregate those.
00:27:17
Speaker
Uh, yeah. And then you need to try and convince whomever might be, if you think that you found the variant, you sort of add up the evidence and see whether it will cross the bar for what we consider reportable to a family. Um, and a lot of this is, you know, continues to be non-automatable. So you have to give people, give humans the tools to sort of try and integrate these different puzzle pieces.
00:27:46
Speaker
and crack the case and we use that kind of terminology.
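To make the idea of aggregating several classes of evidence more concrete, here is a toy sketch. Everything in it is a hypothetical illustration: the `Variant` fields, the weights, the rarity cutoff, and the `REPORTABLE_BAR` threshold are invented for this example and do not represent any clinical standard or the software discussed in the episode.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    conservation: float        # 0..1, higher = more conserved across species (hypothetical scale)
    allele_frequency: float    # fraction of sampled humans carrying the variant
    functional_evidence: bool  # an experiment suggests the variant disrupts function


def evidence_score(v: Variant) -> float:
    """Add up hypothetical points from each class of evidence."""
    score = 2.0 * v.conservation       # conserved positions are more suspicious
    if v.allele_frequency < 1e-4:      # very rare across sampled humans
        score += 2.0
    if v.functional_evidence:          # direct experimental support
        score += 3.0
    return score


# Hypothetical bar for what would be considered reportable to a family.
REPORTABLE_BAR = 5.0

variants = [
    Variant("variant_A", conservation=0.95, allele_frequency=1e-6, functional_evidence=True),
    Variant("variant_B", conservation=0.20, allele_frequency=0.15, functional_evidence=False),
    Variant("variant_C", conservation=0.90, allele_frequency=5e-5, functional_evidence=False),
]

# Rank candidates from most to least supported, then apply the bar.
ranked = sorted(variants, key=evidence_score, reverse=True)
reportable = [v.name for v in ranked if evidence_score(v) >= REPORTABLE_BAR]

for v in ranked:
    print(f"{v.name}: score {evidence_score(v):.1f}")
print("reportable:", reportable)
```

The point of the sketch is only the shape of the task: each candidate collects independent pieces of evidence, and only candidates that clear a pre-agreed bar are reported. As the conversation notes, much of the real process still relies on human judgment rather than a fixed formula.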
00:27:54
Speaker
should be automatable. I mean my sort of like zoomed out view is something like we're trying to read the book of life but we really can't but what we have is lots and lots of copies of that book and we're looking at least for where there's been like an error like in you know where there's a misprint let's say and we we can do that by looking as you say even at other species because we share so much you know so much DNA with them so much of that genome with them
00:28:24
Speaker
Um, or, you know, more particularly at other humans. And then you find it like, is there a single misprint because this is really weird. Everyone else has got this in the text. Um, yeah. Why is that hard to, is it just hard to automate because we don't have all the data in one place or. I mean, one of the reasons is you need a ginormous amount of data. Yeah. So at this stage, hundreds of thousands of humans have been sequenced and a lot of those have been aggregated into databases. Right. But.
00:28:53
Speaker
You know, yeah, some of these variants are just, yeah, they're so, so rare. And actually just getting enough evidence that reaches this threshold of, yeah, that's convincing can be really tough. And particularly before you sort of look at the variant level, you need to have some sort of linkage between the gene
00:29:21
Speaker
and the condition and you need more than one case to do that. Um, so there's, there's a process of matchmaking and there were some really awesome, um, tools and platforms that enable people around the world to do this. They'll be like, okay, I've got a suspected. I think this gene might be implicated in this kid who has this set of phenotypes and then, um,
00:29:48
Speaker
somebody else might have also said, I think that this gene might be implicated in this kid with this set of phenotypes. And the system will be then, yes, match. And then as soon as you've got, yeah, phenotypes is just like descriptions of the condition. And very, very often it'll be like, there'll be many different kind of features of these genetic syndromes because they often will impact a lot of different aspects of
00:30:17
Speaker
of a kid's condition. Not always. Sometimes it would just be like global developmental delay or something fairly general. But if you've got some really specific things, like there's
00:30:36
Speaker
medical geneticists have built up a whole vocabulary for describing all the slightly different ways that facial features can be different, for example. So if you get a good match on some very specific terms like that, that can really increase your confidence that the link between the gene and the condition is a good one. And as soon as you've got like more than one
00:30:58
Speaker
affected child, then you can leverage that. But if you've got a variant, even if it looks really suspicious, but the gene hasn't been implicated in that condition before, you're never going to make it reach this bar, which is reportable to the family. Right.
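The matchmaking process described here can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the clinic names, gene labels, phenotype terms, the `find_matches` function, and the overlap threshold are all hypothetical, and real platforms are not claimed to work this way.

```python
from itertools import combinations

# Hypothetical submissions: each clinic reports a suspected gene plus
# a set of phenotype terms describing the affected child.
submissions = [
    {"clinic": "clinic_1", "gene": "GENE_X",
     "phenotypes": {"global developmental delay", "seizures", "wide-set eyes"}},
    {"clinic": "clinic_2", "gene": "GENE_X",
     "phenotypes": {"seizures", "wide-set eyes", "short stature"}},
    {"clinic": "clinic_3", "gene": "GENE_Y",
     "phenotypes": {"seizures"}},
]


def jaccard(a: set, b: set) -> float:
    """Share of phenotype terms two cases have in common (0..1)."""
    return len(a & b) / len(a | b)


def find_matches(subs, min_overlap=0.3):
    """Pair up cases naming the same gene with sufficiently similar phenotypes."""
    matches = []
    for s1, s2 in combinations(subs, 2):
        if s1["gene"] != s2["gene"]:
            continue
        overlap = jaccard(s1["phenotypes"], s2["phenotypes"])
        if overlap >= min_overlap:
            matches.append((s1["clinic"], s2["clinic"], s1["gene"], overlap))
    return matches


matches = find_matches(submissions)
for clinic_a, clinic_b, gene, overlap in matches:
    print(f"{clinic_a} and {clinic_b} both suspect {gene} (phenotype overlap {overlap:.2f})")
```

Set overlap is used here only because it is simple. As the speaker points out, a match on very specific terms should count for more than a match on general ones, which a plain overlap measure does not capture.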
00:31:16
Speaker
because you do need to be able to sort of sum up this evidence in a convincing way. And of course, there are all sorts of people and groups working on this variant interpretation problem, all sorts of companies who want to throw the latest and coolest AI tools at the problem.
00:31:37
Speaker
So hopefully we'll, you know, we'll see some progress. Yeah, I mean, I think one of the things that you mentioned, and I mean, you did say there'd been, I don't know, a huge decrease in the cost of genomics, but it's almost hard to overstate like how profound that decrease has been. I mean, you said it's been faster than Moore's law.
00:31:58
Speaker
I was looking at this, and I think it's seven orders of magnitude or something. If you go from the Human Genome Project, where it was around $3 billion, and they didn't even do it that, well, they did it well. They didn't actually sequence everything at the time. And now it's down to probably $300 about, and will probably be $100 or so next year. Or less even. But who are you going to sequence?
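A rough way to see what "faster than Moore's law" means with these numbers is sketched below. The two costs are the ones quoted in this segment. The 20-year span and the two-year doubling period for Moore's Law are assumptions made for illustration.

```python
import math

start_cost_usd = 3e9       # Human Genome Project figure, as quoted
end_cost_usd = 300         # "probably $300 about", as quoted
years_elapsed = 20         # assumption: roughly two decades between the two figures
moore_doubling_years = 2   # assumption: one common statement of Moore's Law

# Total improvement, and how many times the cost had to halve to get there.
fold = start_cost_usd / end_cost_usd
halvings = math.log2(fold)
years_per_halving = years_elapsed / halvings

# Improvement a Moore's-Law-like trend would give over the same period.
moore_fold = 2 ** (years_elapsed / moore_doubling_years)

print(f"sequencing: {fold:,.0f}-fold, one halving every {years_per_halving:.2f} years")
print(f"Moore's Law pace over the same span: {moore_fold:,.0f}-fold")
```

Under these assumptions the cost halved in under a year on average, against a halving every two years for the Moore's Law pace.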
00:32:24
Speaker
Yeah, well, I think when it gets cheap enough and you have enough, you sort of have in the terminology of like startups, like some killer app, right? I think ChatGPT has been that killer app for machine learning, where there's been enormous, you know, progress over decades. And, you know, again, also reductions in compute costs, improvements in models, like improvements in the training. And just suddenly it's like crystallized in this one thing. I was like, Oh, I'll get that. Like I'll use that. Yeah.
00:32:54
Speaker
I think 23andMe notwithstanding, I don't think there's been that kind of revolution in genomics in terms of... Right. So what is the killer app? There's all sorts of people that would like very much to answer that question. And the ancestry testing has been surprisingly, at least in my mind,
00:33:15
Speaker
popular and has really driven a lot of people, we're talking millions of people, to, in their case, they're not doing sequencing, they're doing genotyping, which is getting a far smaller set of DNA. It's like looking at 200,000 of the most variable points in our genomes and it's giving what they frame as ancestry information and a lot of people are really interested in that
00:33:44
Speaker
And then there's another subset of people who are really interested in the use case of identifying blood relatives. There's a lot of people who have been adopted who are really interested in that or donor conceived or just estranged from their families and they don't want to be, their biological families that is.
00:34:12
Speaker
So those have been really popular use cases, but the kit sales for those companies have been leveling off. So it seems like they might've tapped most of the market. So people hope that people will
00:34:38
Speaker
get the genome sequenced because it really reveals actionable health information.

Health Implications of Genome Sequencing

00:34:44
Speaker
And there is some actionable information to be had within the genome. I think there's less of it for healthy people than we might think. And when I say less of it, I mean like the type of information that you could then take to your doctor and your doctor would be like, oh yeah, so I'm going to now order you this
00:35:08
Speaker
new screening test or I'm going to put you on this new drug or I'm going to do something different because you share this information with me. There's actually surprisingly little stuff that rises to that level. Is that because
00:35:27
Speaker
there's just other factors that are much more important? Or is it maybe that there's just things that we don't yet know about, you know, the way that genes function? And maybe we find actually, you know, personalized medicine could be really powerful because
00:35:43
Speaker
You know, I don't know, just the choice of aspirin manufacturer or something that is used would be, you know, optimized on a personal basis, things as trivial as that. Or is it really just that those things are super robust? I think it turns out to be more complicated than we thought it was. So one example is, OK, so far we've been limited to talking about sort of conditions for which single genes have a really large role to play. So let's round out that story to where we are today.
00:36:14
Speaker
So most of those discoveries were based on families or individuals within those families who had really strong phenotypes. But it then turns out that if you look at a healthy group of people
00:36:34
Speaker
There were some of them that carry those same genetic variants, but they're not expressing that same phenotype. Right. So that's called penetrance. So a completely penetrant variant would be one that, you know, no matter who had it, they're going to develop that condition. Like Huntington's is actually an example of that. Like if you've got the variant, you're going to get the condition. And that is like, is that something? Yeah.
00:36:59
Speaker
And it just, something that we've been learning, and this is much more recent, is that a lot of the things that we thought were very highly penetrant turn out not to have been. As we've gathered more data of the combination of the genetic data with electronic health record data, we've realized that there's all sorts of individuals walking around with these variants that we thought would lead to like severe genetic disease and they don't.
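A minimal sketch of the penetrance idea described above: the fraction of people carrying a variant who actually develop the condition. The counts are invented for illustration and do not come from any study; the function name is likewise hypothetical.

```python
def penetrance(carriers_affected: int, carriers_total: int) -> float:
    """Fraction of variant carriers who express the phenotype."""
    if carriers_total <= 0:
        raise ValueError("need at least one carrier")
    return carriers_affected / carriers_total


# Hypothetical counts: carriers found through affected families
# versus carriers found in a general-population biobank.
family_estimate = penetrance(carriers_affected=18, carriers_total=20)
biobank_estimate = penetrance(carriers_affected=30, carriers_total=200)

print(f"family-based estimate: {family_estimate:.2f}")
print(f"biobank-based estimate: {biobank_estimate:.2f}")
```

With numbers like these, the same variant looks highly penetrant when only affected families are studied and much less so once healthy carriers are counted, which is the pattern the speaker describes.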
00:37:25
Speaker
Yeah. And so that's a more complicated picture then. It's complicated, but could it be, I mean, I suppose we don't, do we understand why? Okay. So why are these variants that we thought were always going to be pathogenic, as is the term we use, not? Well, look, there are lots of theories, including the fact that the genetic background, like the other variants that you have, makes a difference,
00:37:53
Speaker
that various environmental factors make a difference. There are many different flavors of theory, but I'd say no. I'd say that this issue of penetrance is a big kind of relatively open question in the field right now. Let me just say something about other complications. So we know that most common conditions have a genetic component, right? So something like type two diabetes
00:38:23
Speaker
We know from family studies, these are typically studies where you're comparing identical to non-identical twins to get some sense of the kind of genetic, the fraction of the variance in the trait, which can be attributed to differences in genetics versus differences in environment. Okay. There's a whole other conversation to be had about how reliable you think these numbers are, but we talk about some traits being more heritable than others.
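One classical way to turn the twin comparison above into a number is Falconer's formula, which doubles the gap between the identical-twin and non-identical-twin trait correlations. This is a textbook estimator offered as a sketch, not necessarily the method behind the figures quoted in this conversation, and the correlations below are made up.

```python
def falconer_heritability(r_identical: float, r_nonidentical: float) -> float:
    """Falconer's estimate: h^2 = 2 * (r_MZ - r_DZ).

    r_identical    -- trait correlation between identical (MZ) twin pairs
    r_nonidentical -- trait correlation between non-identical (DZ) twin pairs
    """
    return 2.0 * (r_identical - r_nonidentical)


# Hypothetical correlations for a highly heritable trait.
h2 = falconer_heritability(r_identical=0.9, r_nonidentical=0.5)
print(f"estimated heritability: {h2:.2f}")
```

The estimate describes one population in one environment, which is why the reliability caveat raised above matters.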
00:38:49
Speaker
Um, so something like height, uh, is highly heritable, maybe 80, 90%. Um, it's, uh, the, the heritability depends also on the environment. Um, if you're in an environment where there was, um, like not, not everybody was getting good nutrition, then you would, you would measure a different value for heritability, for example. Um,
00:39:15
Speaker
So, okay. So what does that mean? It means that we expect to find genetic variants, which are associated with height. Um, but the same can be said for, it turns out in fact, to an embarrassing extent, um, most traits that we can measure in humans. Um, so it certainly turns out to be the case for IQ, for the number of years that you're going to stay in education,
00:39:44
Speaker
for your predisposition to heart attack, to type two diabetes, to get common cancers, to your agreeableness and other measures of your personality, to how likely you are to get divorced if you get married. Like all of these things have a heritability which is much higher than zero. And so how do we go and try and find the genetic variants that are associated with those? Well, we,
00:40:13
Speaker
let's take the case control setting so people who do get a heart attack before age 50 say and those who don't, you're literally just comparing all of your cases and controls and you're trying to find variants which are more common in your cases than your controls. That's called a genome-wide association study or GWAS and
00:40:34
Speaker
One of the big lessons from the whole field of GWAS is that we needed really, really large sample sizes to get off the ground because it turned out that all the variants, that there were variants to be found, but they all had very small effect sizes. So you

Genetic Data and Research Challenges

00:40:51
Speaker
needed really large numbers in your studies to be powered enough to detect them. So, okay. So GWAS started being successful, really successful, maybe like
00:41:02
Speaker
seven years ago or something like that. And at this stage for like a lot of these conditions, for example, schizophrenia, we've identified quite a lot of these variants, which make somewhat of a difference. But all of these variants make much less of a difference than we thought that they were going to. But you can start to aggregate the
00:41:25
Speaker
many effects of these variants, each of which has a small effect, into numbers that we call polygenic risk scores, and those, many would argue, are starting to be predictive enough to be used in clinical practice. Right, so just to make sure I got it: I mean, your early work was looking at kind of single genes and how
00:41:50
Speaker
for instance, I think there's a single gene associated with Huntington's disease, right? That's right, yeah, yeah. So they're kind of like the clearest signals that you can get. I mean, in particular, Huntington's disease is like very clear, like that's, as you say, completely, was it permanent? Penetrant. Penetrant, sorry. Completely penetrant. So if you have that gene, like, you're going to have that disease. Whereas there's other cases where it's sort of like, well, you might not. And then
00:42:19
Speaker
then you get to look at sort of the combination of how genes can affect particular traits, and that's kind of where the poly comes in in the polygenic. Right, and the models today are almost all strictly additive, so you find these
00:42:40
Speaker
you've just done a big sort of statistical analysis and you've identified that there's, that this, if you have this genetic variant at this position, you have a slightly increased chance of developing, let's say type two diabetes to take an example, but then you found a few hundred of those and you literally just add them up. Okay. Proportional to their effect sizes. This is the simplest case. And you get out a number,
00:43:10
Speaker
And that number is a bit higher in the cases, the people who develop type 2 diabetes in your validation data, this is, than the people who don't. And then there's a question of, so that means it's somewhat predictive, right? And the question is, how predictive does it need to be to be useful to share with generally healthy people?
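The additive model described above can be sketched as a weighted sum: each variant's effect size times the number of copies of the risk allele a person carries (0, 1 or 2). The variant names, effect sizes and genotypes below are all hypothetical, chosen only to show the arithmetic.

```python
from statistics import mean

# Hypothetical per-variant effect sizes, as a GWAS might report them.
EFFECT_SIZES = {"variant_a": 0.10, "variant_b": 0.05, "variant_c": -0.02}


def polygenic_risk_score(genotype: dict[str, int]) -> float:
    """Sum of effect size * risk-allele count over all scored variants."""
    return sum(beta * genotype.get(variant, 0)
               for variant, beta in EFFECT_SIZES.items())


# Hypothetical validation data: risk-allele counts per person.
cases = [
    {"variant_a": 2, "variant_b": 1, "variant_c": 0},
    {"variant_a": 1, "variant_b": 2, "variant_c": 1},
]
controls = [
    {"variant_a": 0, "variant_b": 1, "variant_c": 2},
    {"variant_a": 1, "variant_b": 0, "variant_c": 1},
]

mean_case_score = mean(polygenic_risk_score(g) for g in cases)
mean_control_score = mean(polygenic_risk_score(g) for g in controls)
print(f"mean score in cases:    {mean_case_score:.3f}")
print(f"mean score in controls: {mean_control_score:.3f}")
```

A higher average score in cases than in controls is what "somewhat predictive" means here; how large that gap must be before the score is worth sharing with healthy people is the open question raised above.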
00:43:37
Speaker
And then to get back to your killer app question, like could this be a killer app for people wanting to go and get sequenced because they would get this information.
00:43:49
Speaker
which might be actionable for their health. Yeah. And there's perhaps like a little bit of a chicken and an egg situation going on here and then, yeah, suddenly those polygenic risk scores are predictive unless they're just noise, right? And they might just be noise if the effect size is low and the sample is small. And, yeah, they're only
00:44:16
Speaker
they're going to get better as we have sort of larger samples, or more people whose genome has been sequenced. But it's like, to give people the incentive to do that, you need these good polygenic risk scores. Right. I mean, basically, we need people to donate their data. So,
00:44:44
Speaker
If you sign up for 23andMe, you're going to get asked if you're willing to let your data be used for research. This is in an anonymous fashion. There's a sort of like side story about whether genetic data can truly be anonymous, but currently it's viewed as anonymous, right? So are you willing to let your genetic data be used in an anonymous fashion for research?
00:45:07
Speaker
About 80% of 23andMe's customers say yes. So that's a big, that's a really big data set. The UK Biobank is probably the single most used data source right now. That's the cohort of like 500,000-ish Brits who are now all in their 60s and 70s, I think.
00:45:28
Speaker
Um, and how was that? Uh, is that full genome sequencing from that or is it like building up to that? I believe they've released whole exomes, which is the protein-coding part of the genomes, already. And I think they're working on the genomes. And this is where some people start to get their hackles raised, because the way they fund that is they partner with pharma companies who are willing to pay, um,
00:45:53
Speaker
and they get back something in return. Often, I'm actually not sure exactly how it works with the UK Biobank data, but often they'll get exclusive access for 12 months or something like that. So those kind data donors were willing to give access to their data, to cede a certain amount of control over their data, in order to further science.
00:46:23
Speaker
you know, thank you to all those people.

Ethical Concerns in Genomic Data Use

00:46:26
Speaker
Um, but something we find is that it's, it's non-trivial to get people to sign up to these studies. People have a lot of concerns about, about giving away their data like that. Um, yeah. So yeah. Um, privacy, people worried about privacy, I think often in quite a sort of, um, abstract way. Yeah. Um, and worried about confidentiality.
00:46:54
Speaker
It's also the case that, so one of the things that research teams will do to help sort of make it worthwhile for people to give up their data is to offer to return results to them that will be very medically actionable. And that probably does incentivize some people. But then you also get this issue of genetic discrimination, which is,
00:47:23
Speaker
If you know something about your health and you don't share it with a life insurer, that's fraud. So in the US, there is no protection against genetic discrimination at the federal level when it comes to life insurance. There is for health insurance, for most health insurance and for most employment settings, but that's it. There are various other settings in which there is no protection.
00:47:52
Speaker
Um, so, uh, you know, it could be the case. I would say this is all very, you know, to a large extent still theoretical at this point, because there really haven't been that many cases that you, you know, you sign up for a genetic study, you find out you've got some sort of, uh, a variant that puts you at an increased risk for a disease. And there aren't any actions that will fully mitigate against that increased risk that you can take and a life insurer might
00:48:21
Speaker
say, no, we don't want to take you on or charge you higher premiums or something. So people worry about genetic discrimination as well. And then, yeah, so people probably rightly think twice before they are willing to give over their genetic data.
00:48:41
Speaker
Yeah. I mean, it, you know, on the one, at one level, it seems like there's almost a moral imperative to do this. Yeah. But yeah, it's, it's complicated. As you say many times, like this is challenging and it seems like, yeah, there's kind of opposing incentives between, you know, for pharma companies, like there's certainly a benefit for
00:49:06
Speaker
wanting everyone to do this, but for individuals, the life insurance gives like a really good, and I wasn't aware of that, like that's really. And it's not even that kind of concrete, for some people it is, the people who have read the fine print of the consent forms, for example, but for a lot of people I just hear, oh right, but then my data might get shared for something that I don't support.
00:49:36
Speaker
uh, you know, 23andMe might get bought by some player that I don't like what they're doing. Um, just, you just, to a large extent, you lose control over what happens to your data. And I think a lot of people don't like that. A lot of people, um, also are very willing to think, oh, if we're, if it's kind of academic scientists, we're happy to support their work by sharing our data. Yeah.
00:50:05
Speaker
But if it's connected to the for-profit sector, then we feel iffy about that. Even though, of course, the for-profit sector is ultimately motivated or to a large extent to discover the new medicines that are so sorely needed. And I would also say that those individuals who are sick with a suspected genetic component or very often the families of the kids who are,
00:50:34
Speaker
they don't have any of these qualms. They're like, take our data, share it with as many people as you can, help us find some answers. Yeah. Yeah. And I wonder, I mean, I think with privacy, the, it's, it's a moving window in terms of what, what people regard as
00:50:54
Speaker
fair in terms of data sharing. I think it's changed a lot. It seems to have swung one way and back again, at least for things like location sharing, where I think people were at first pretty blase about, oh, I'll share my location and I'll get some
00:51:11
Speaker
you know, service back from this. But then a lot of apps would just ask for location, which had no real obvious need for that. And now I think it's got much stricter and similar for maybe similar for photo sharing and like pictures of oneself appearing online, you know, even if, you know, I didn't put them out there, you know, that
00:51:37
Speaker
we've kind of grown fairly comfortable with that because sort of now in the public space it's just what happens like you know your photo gets taken by a friend or something like that and I I'm not seeing it happen but I do wonder if this might happen with gene sequencing I mean our DNA is like everywhere right and the cost of sequencing it is so low um that
00:52:06
Speaker
it just makes one wonder, well, you know, at some point will I even have a reasonable expectation of privacy? Say if I go to a restaurant or something and they want to do a survey of their diners to sort of understand, like, oh, can we optimize our flavors, you know, just to give some completely impossible sci-fi scenario here, and they just extract everyone's DNA and they're like, oh yeah,
00:52:29
Speaker
next time you go and I've got the best, you're going to have the best experience eating here today because we've built this dish for you. Well something that you're touching on is a field called environmental DNA. It's actually really pioneered by people studying things like invasive species or species at risk from climate change and changing environments, which is extracting DNA from the environment because
00:52:53
Speaker
you know, organisms shed DNA as they move through their worlds. And we are no different, it turns out. So even if what you're wanting to study is, you know, endangered turtles, if they're hanging out anywhere where humans also are, you're also going to gather a whole lot of human DNA. And this falls in this, like, a little bit of a gray zone from our sort of consent frameworks, etc.
00:53:22
Speaker
And I think that's a very interesting area to watch, where we're going to need new norms. So my field, my sort of academic field, is called ELSI, which is the ethical, legal and social implications of genetics and genomics. They got a good acronym for that. Yeah.
00:53:44
Speaker
And we should say as well, so after working sort of in industry, trying to solve these problems, I guess, I suppose you encountered another kind of problem, which is like the... I was finding that I was, so I'll just close out the comment that I think we need a new sort of subfield of ELSI for environmental DNA. But yeah, I was finding in my work in the genetics industry that these sorts of ethical issues were popping up. So I'll give you just a couple of examples.
00:54:14
Speaker
One was connected to the life insurance thing that I was mentioning before. Like

Race and Ethics in Genomic Interpretation

00:54:19
Speaker
the first time I had to write a consent document for individuals to participate in genetics research.
00:54:26
Speaker
I had to copy and paste a paragraph of text, which basically explained what I explained to you about not being protected in case of life insurance. And that just felt so wrong. It's like, what? We should have designed the system so that... We should do everything that we can to try to, you know, incentivize people to do this. Exactly. And I should just clarify that I was explaining the situation in the States; it varies by country. Right.
00:54:54
Speaker
Uh, so that felt really... And another thing was connected to this process I was describing before, where you're interpreting a variant and seeing if it's actually linked to a condition. There had been new guidelines published that said that you should try and match the frequency that that variant occurred in, in, um, I think it said a race-matched way, which seemed to me like really weird, like,
00:55:21
Speaker
you know, we know race is socially constructed, et cetera. And indeed that's an area that I've subsequently done a lot of work in, just sort of some very unclear concepts in the mix that people were then trying to kind of codify in systems that were designed for clinical practice. That was what I was trying to do. I was needing to read this paper that was telling me how to interpret variants
00:55:48
Speaker
and sort of code it up, but there were sort of big gaping holes just everywhere, basically, in terms of the concepts at play and how they were operationalised. Yeah, and then, you know, around this time,
00:56:08
Speaker
and I was still working in industry, CRISPR was really taking off. And with a colleague who was studying for her law PhD at the time, Sarah Polcz, she and I wrote a few articles for academic journals as sort of weekend projects. So that was me dipping my toe in the water; that was on various aspects of CRISPR, so gene modification.
00:56:36
Speaker
And I decided to try and get back into academia. And I found out, in fact, to my great good fortune, that ELSI, this ethical, legal, social implications of genetics and genomics, is probably the single most advanced and most well-funded field in applied ethics that there is, full stop. And the reason for that is historical.
00:57:04
Speaker
when they were contemplating the Human Genome Project, there was an agreement to set aside a small percentage of the overall budget for what became known as ELSI. Apparently Watson might have agreed to this because he thought it would shut
00:57:31
Speaker
Um, but, but, um, you know, some work was done alongside the sequencing of the first genome. And ever since, um, there has been a pot of money set aside for ELSI. So currently the institute that funds genomics research in the States, the National Human Genome Research Institute, it's part of the NIH, has a federal mandate to spend 5% of its money
00:58:01
Speaker
on ELSI research. And that gives a continuous stream of money for that field. It enables people, has like big training grants to support people to come into that field. And it's, it's a pretty, it's a pretty busy field. For example, neuroethics, which I think should be at least as large a field, is minuscule in comparison. And it's because of this like federally funded structure.
00:58:32
Speaker
And that is basically what's enabling me to break back into academia. There aren't really any departments that fund this kind of work and that would be like, oh yeah, come and be a faculty member of my department. It's like this really weird interdisciplinary thing, but there is grant money to support it. Okay. But no, I was lucky in that
00:58:58
Speaker
my current boss is a political philosopher called Danielle Allen. She's an absolutely amazing, larger-than-life character. She had got interested in polygenic risk scores, which we were discussing earlier, because a friend of hers, a geneticist, had said, Danielle, I'm really worried about what's happening in my field from an ethics point of view, please help. So,
00:59:26
Speaker
So Danielle had got interested in these polygenic risk scores. And when I decided to leave industry and to try to get back into academia, I found very, very few options of places that I could apply. And one of them was Harvard's Ethics Center, which has a fellowship-in-residence program. This is the center that Danielle runs. And when she saw my very
00:59:53
Speaker
quirky application, she was already interested in polygenic risk scores. So I think that gave me my foot in the door. And yeah, and I've been back in academia for four years now.
01:00:14
Speaker
thinking mostly about genetic ancestry, about polygenic risk scores, and then some other things as well. Yeah, there's a lot to talk about all that we can talk about from here. And I do want to come back to CRISPR, because I found your work with Sarah on this really interesting. But maybe if we talk about race, because I think this is where you've really been
01:00:40
Speaker
you know, trying to change, I guess, the way that research is being done recently. So maybe if you start by telling us, like,
01:00:53
Speaker
you know, how race was, slash is, being used in biology, and how the field has been trying to move on from there, and the degree of success with which it's managing to do so. Right. Yeah. I like to frame this as the opportunity, the danger and the path forward. Okay. So I guess maybe we just start by saying like,
01:01:21
Speaker
race is a social construct, and then people are like, oh well, everything's a social construct. It's like, yeah, but, uh, if you look at the early 20th century US, an Italian would not have counted as white. Yeah. So it's just so sort of space- and time-specific, these categories that we put people in. Yeah. Um,
01:01:44
Speaker
And it's, you know, it's almost always a function of whoever's benefit it is to put people into these different categories. Yeah. They've not been designed for biology. They've definitely not been designed for biology. Although, you know, really, I think, you know, the people who originated some of these concepts were sort of like, I mean, Linnaeus, for instance, was a botanist. Right. But they just didn't have any, like, it's clear they weren't really looking at the data so much as just
01:02:12
Speaker
That's right, that's right. And they were, you know, creatures of their time and their political environments. Yeah, so I think, so here's the opportunity. So, particularly, two pieces of context.
01:02:34
Speaker
One was the COVID pandemic really just highlighting these awful race-based health disparities in outcomes. And then the other was the Black Lives Matter movement, which just made people particularly aware of the way that race structures lives. And so the people were then really looking
01:03:04
Speaker
at the practice of medicine and sort of biotech more generally as well, and noticing that race was often baked in to clinical algorithms, just to give one example, in ways that were not really justified or justifiable. A really famous example is for kidney function, like you estimate, this is done like
01:03:30
Speaker
I don't know, hundreds of thousands of times a day, probably. Maybe not hundreds of thousands, but just super, super common practice across clinical medicine. And you fill in various things like age and you get some results from blood tests. And then one of them is like, is this patient African-American? Yes or no. And that adds a correction factor.
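The speaker does not name a formula. As one illustration, the 4-variable MDRD study equation, a widely used estimate of kidney function (eGFR), multiplied its result by 1.212 for patients recorded as African American. The sketch below uses that equation's published coefficients; the patient values are made up, and the function name is an assumption for this example.

```python
def egfr_mdrd(serum_creatinine_mg_dl: float, age_years: float,
              female: bool, african_american: bool) -> float:
    """4-variable MDRD estimate of eGFR in mL/min/1.73 m^2."""
    egfr = 175.0 * serum_creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if african_american:
        egfr *= 1.212  # the race-based correction factor
    return egfr


# Same hypothetical blood test and age, differing only in the race checkbox.
without_flag = egfr_mdrd(1.8, 55, female=False, african_american=False)
with_flag = egfr_mdrd(1.8, 55, female=False, african_american=True)

print(f"eGFR without race term: {without_flag:.1f}")
print(f"eGFR with race term:    {with_flag:.1f}")
```

Because a higher eGFR reads as better kidney function, the multiplier makes the same lab result look healthier on paper for one group of patients.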
01:03:51
Speaker
So in a way that basically ends up systematically making it harder for African Americans to qualify for kidney transplant, just as an example. So a lot of people started looking at these algorithms and being like, oh, hang on, wait a second. Why are we doing that? And in almost all cases, there weren't good answers. So there's a whole sort of process of trying to
01:04:22
Speaker
question use of race as a variable in biomedicine. And another piece of context is within genetics and genomics, we've had much more attention on kind of the diversity of the samples we have. So for example, the polygenic risk scores that we were talking about earlier,
01:04:49
Speaker
They generalize well to individuals who look like the training data and not beyond that. Why is another conversation, but anyway, they don't. So people are like, oh dear, we've really got to rethink who is in our data. So both in biomedicine as a whole, and in my field in particular, people were thinking about,
01:05:19
Speaker
who's in the data, what's the appropriate way to categorize them, et cetera. And the folks over in the sort of clinical biomedicine world, a lot of people were saying, okay, race. Clearly it's not appropriate to use that. But genetic ancestry, that's something that is, you know, evidence-based, objective, fixed characteristic of the genome. We should be using that.
01:05:48
Speaker
And what is, yeah, maybe tell us what is, well, how are they characterizing genetic ancestry? Well, the dominant way is with these continental labels like African ancestry, European ancestry, Asian ancestry. And this is the danger because those are in practice directly conflated with the race-based categories that we're wanting to move away from. So black, white, Asian.
01:06:14
Speaker
Yeah. There doesn't seem to be much difference between saying, like, European and European, or like, you know, white European versus European ancestry. Right. Right. And some people will say, Oh no, there's this notion of admixture, which is part of the solution. No, it's part of the problem, because it also assumes that you've got these pure types. The basic error is to think that humans come in some small number of basic types. Um,
01:06:44
Speaker
And that's what this use of continental ancestry categories really perpetuates. It's the idea that you're a European ancestry person, or you're an African ancestry person, or maybe you're some admixture of the two. Humans do not come in a small number of basic types. And it's not even a good model. So there's a,
01:07:12
Speaker
You know, humans are interrelated in this ginormous human family tree. A tree is actually not a good metaphor. It's more like, some people like to say, a braided stream. Or a kind of web, I guess. Yeah, a kind of web.
01:07:28
Speaker
a directed acyclic graph basically. Very nice. And DNA has been passed down through that web. And there's an object, a mathematical object called the ancestral recombination graph, which just captures that idea. And then for any individual, what myself and collaborators want to say
01:07:55
Speaker
And this is really beautifully articulated in a piece called What is Ancestry? by Iain Mathieson and Aylwyn Scally. But genetic ancestry just is this ancestral recombination graph. So for an individual, your genetic ancestry is a sort of subset of paths through your family tree by which you have inherited the DNA that you have.
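A toy sketch of that distinction, under heavy simplifying assumptions: a three-generation pedigree stored as a directed acyclic graph, plus a hand-written record of which parent passed on one traced copy of each of two DNA segments. Real ancestral recombination graphs are inferred from sequence data and are far larger; every name and segment here is hypothetical.

```python
# Pedigree edges: child -> (mother, father). This is the genealogy.
PARENTS = {
    "you": ("mum", "dad"),
    "mum": ("grandma_m", "grandpa_m"),
    "dad": ("grandma_d", "grandpa_d"),
}

# For one traced copy of each segment: who passed it to whom.
TRANSMITTED_BY = {
    ("you", "segment_A"): "mum",
    ("mum", "segment_A"): "grandpa_m",
    ("you", "segment_B"): "dad",
    ("dad", "segment_B"): "grandma_d",
}


def genealogical_ancestors(person: str) -> set[str]:
    """Everyone reachable by following parent edges upward."""
    found, stack = set(), [person]
    while stack:
        for parent in PARENTS.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def inheritance_path(person: str, segment: str) -> list[str]:
    """The single path through the pedigree along which a segment travelled."""
    path = [person]
    while (path[-1], segment) in TRANSMITTED_BY:
        path.append(TRANSMITTED_BY[(path[-1], segment)])
    return path


def genetic_ancestors(person: str, segments: list[str]) -> set[str]:
    """Ancestors who lie on at least one traced inheritance path."""
    ancestors: set[str] = set()
    for segment in segments:
        ancestors.update(inheritance_path(person, segment)[1:])
    return ancestors


everyone = genealogical_ancestors("you")
dna_sources = genetic_ancestors("you", ["segment_A", "segment_B"])
print(sorted(everyone))
print(sorted(dna_sources))
```

In this toy the traced segments pass through only four of the six ancestors, so the genetic ancestry is a subset of paths through the genealogy, and nothing in the structure requires grouping or labelling anyone.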
01:08:20
Speaker
And that big old human family, non-tree, is structured in all sorts of interesting ways. It's structured by everything that influences who has children with who, which includes geography, it includes language, it includes religious practices, it includes all sorts of other things. And to default to this same set of big, big old categories,
01:08:49
Speaker
is just far too much of a gross oversimplification of what's actually going on. So we advocate that, first of all, it's really helpful to get conceptually clear about what genetic ancestry is, because it makes clear that a notion of a group, of dividing that tree up into chunks, is not inherent to the concept.
01:09:15
Speaker
It's not, it's also not inherent to the concept to sort of label the individuals in that graph with anything other than their genealogical connections. Now you can choose to impose groups, or you can choose to add labels, but those are choices which individual research teams can make depending on what their use case is, what it is they're actually trying to do. And, uh,
01:09:42
Speaker
And something that I do think that we have been learning across genetics and genomics is a lot of the use cases where we thought that what we needed to do was to carve up that tree. It turns out it's better to instead treat these sorts of patterns of variation in a continuous fashion behind the scenes. So I think that moving away from these big continental ancestry categories
01:10:11
Speaker
is good science and it's also good ethics. And to say more about the good ethics piece, like the concepts that we use don't just frame intellectual inquiries in a subject like genetics. They also frame, um, much bigger questions. So, uh, actually in this case, it's not even just clinical medicine or public policy.
01:10:38
Speaker
It's also the very way that we sort of might think about our species and how we're all related to each other, sort of existential questions almost about our species. And yeah, so the stakes are quite high to get this right. And genetics not only has kind of a very intertwined history with racism and eugenics, but it continues to be basically the dominant philosophy of white supremacists.
01:11:07
Speaker
And yeah, like this is just such a sad picture in the States, but one of the mass shootings from last year in Buffalo in which 10 African Americans were murdered, that shooter had posted a manifesto online, which cited a bunch of genetics research to sort of justify his beliefs.
01:11:32
Speaker
Um, and that's not uncommon. Like there's all sorts of nasty corners of the internet where a lot of, particularly the figures actually from genetics papers get reproduced and get said, look, we really are from these different groups. Genetics proves it. We're genetically superior, they're inferior, et cetera. Um, so, uh, it's, it's really like, yeah, high stakes to try and get to seize this opportunity when people are reconceptualizing human
01:12:02
Speaker
biological difference to not just make the same mistakes that we've been making over and over again in the field of genetics and genomics over the last 100 years. So. Yeah. Yeah. Yeah. There's a lot to unpack here. Yeah. This is really fascinating stuff. I think just on your last point, I'll link in the show notes. There's a really great article on Undark about that shooting, which references a lot of your work.
01:12:30
Speaker
And it talks as well about how the shooter had not only referenced, you know, texts talking about race from biology, but ones talking about, um, ancestral, I'm sorry, I've forgotten the name, the ancestral categories. Oh yeah. Which shows that, obviously, yeah, I mean, it's obvious that those things are going to be conflated. I mean, they're almost not even,
01:12:58
Speaker
it's not even conflation necessarily, they're just so closely aligned. Right, and in other empirical work I've done, you know, you see this conflation made by scientists, by doctors, by, you know, regular folk, like everybody conflates the categories. And it's not helped by the fact that this kind of concept ancestry, it's not even really a concept, it's a mishmash of things. And again, I've done a bunch of empirical work on this,
01:13:27
Speaker
when we've asked scientists whose work, like, uses this concept, um, quite substantially what they actually mean by it. Yeah. And they have a really, really hard time answering, and the more honest ones say things like, um, "Well, my mentor said we used to use ethnicity, but now we should use ancestry." Yeah. Or
01:13:53
Speaker
Well, we wanted to say mixed race, but that didn't sound very good, so we used mixed ancestry. Yeah, it's just like a PC term. It's a PC term. Like another quote that I remember, it was something along the lines of, you know, quite frankly, I'm willing to say it's a dodge, it just doesn't raise people's hackles in the way that race or ethnicity does. Right. So so like
01:14:15
Speaker
It's just very, very unclear what it actually is. So something we advocate for, myself and collaborators, is you should never just use the term ancestry. You should qualify it with genetic ancestry or genealogical ancestry or something that actually, like,
01:14:34
Speaker
names something specific. The term population is another big bugbear of mine. Like it sounds scientific, but again, if you ask people, including scientists who are using the term over and over again in their work, what they mean by it, you're not going to hear the same answer twice. Like if you're a population geneticist and you're really working with, uh, you know, a nice clean, simple model, a population is a group of individuals who are mating at random. That is a population geneticist's
01:15:03
Speaker
definition of a population, and it only ever appears in models. People generally don't mate at random. I'm not seeing that at all. Yeah, right. Not a thing that generally happens. And then if you are a statistician, the population is like the group of people that you think your sample generalizes to. For most people, population means like, oh, it's the population of Edinburgh, or it's the population of the UK, right? It's like, it's a demographer's sense of population.
01:15:33
Speaker
But if you're reading like a public health article or a medicine article, it's completely unclear what is meant and it sounds scientific. That's what's dangerous about it. It sounds like you're actually using some clear concept and you're totally not. Like again, I remember one interviewee in the study I did,
01:15:53
Speaker
was like, yeah, I have actually thought about this. And I've worked out what it means. It means large n. And I think that's probably the most honest answer. That's pretty much what it means. That's great. I mean, as you say, like the, you know, there's danger here, but there is an opportunity and the stakes are high. It's not there is
01:16:13
Speaker
like a moral imperative to get this right and there's also a scientific incentive, because the problems that we're coming back to are things like getting your polygenic risk scores as accurate as possible, because as you say, like, the quality of those scores
01:16:31
Speaker
depends on the individual that you're trying to make a prediction for, to say, you know, for example, you have increased risk of type 2 diabetes or whatever.
01:16:45
Speaker
they need to be, the conclusions need to be drawn from, I don't wanna say population now, but from a sample, right? Based on a sample that is very close to, you know, that individual's genome in a meaningful sense. And this, you know, the kind of answer that you're suggesting is something around, is the ancestral recombination graph, I guess.
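A minimal sketch of the arithmetic behind a polygenic score, for readers following along: the score is a weighted sum of how many copies of each effect allele a person carries, with the weights estimated from a study sample. The variant names and effect sizes below are invented for illustration and do not come from any real study.

```python
# Hypothetical per-allele effect sizes, as a GWAS might report them.
# Variant IDs and numbers are made up for illustration only.
effect_sizes = {"variant_a": 0.12, "variant_b": -0.05, "variant_c": 0.30}


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele counts (0, 1 or 2 copies per variant)."""
    return sum(weights[v] * dosages[v] for v in weights)


# One hypothetical person: two copies at variant_a, one at variant_b, none at variant_c.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score = polygenic_score(person, effect_sizes)
print(round(score, 2))
```

The weights are only as good as the sample they were estimated in, which is why the match between that sample and the individual matters so much in the discussion here.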
01:17:08
Speaker
Right. Or you might even say this is a use case. There's a, there's a wonderful geneticist in the UC system called Graham Coop. And he just says, look, for this kind of thing, what's important is a measure of genetic similarity. That's all. We don't need to even invoke anything to do with ancestry. Right. And through time slices. And
01:17:29
Speaker
Yeah, we need to somehow like control for genetic similarity. And by the way, one of the main reasons why it's thought that these polygenic scores do not port well to other groups is that they've been trained on the type of data that we were describing that 23andMe, et cetera, gets, which is just this genotype data, where you're just looking at a few hundred thousand points of the genome. They haven't been trained on looking at the whole genome. Right.
01:17:59
Speaker
It's also the case that like genetic variants are correlated with each other because of the way DNA is inherited. So two variants that are literally like closer together on a chromosome are more correlated with each other than ones which are further apart on a chromosome.
01:18:17
Speaker
And so what we've been finding in many of these GWAS, it's not the kind of mechanistically causal variants that are actually having an effect on a phenotype. It's variants that are correlated with other rarer things that we haven't directly got data for.

Accuracy and Implications of Polygenic Scores

01:18:36
Speaker
And those patterns of which genetic variants are correlated with which, those vary across the ancestral recombination graph.
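The point about correlated variants can be sketched numerically, under simplifying assumptions that are not stated in the conversation: one causal variant, one measured "tag" variant, random mating, and additive effects. The haplotype frequencies are made up. With the same causal effect in both groups, the effect picked up at the tag variant differs because the correlation between the two variants differs.

```python
from math import sqrt


def ld_stats(p_AB, p_Ab, p_aB, p_ab):
    """Linkage disequilibrium between causal allele A and tag allele B.

    Arguments are the frequencies of the four haplotypes (AB, Ab, aB, ab).
    """
    total = p_AB + p_Ab + p_aB + p_ab
    assert abs(total - 1.0) < 1e-9, "haplotype frequencies must sum to 1"
    p_A = p_AB + p_Ab            # frequency of the causal allele
    p_B = p_AB + p_aB            # frequency of the tag allele
    D = p_AB - p_A * p_B         # covariance of the two alleles on a haplotype
    r = D / sqrt(p_A * (1 - p_A) * p_B * (1 - p_B))
    return p_B, D, r


def tag_effect(beta_causal, p_AB, p_Ab, p_aB, p_ab):
    """Expected regression slope at the tag variant when only A is causal.

    slope = beta * Cov(A, B) / Var(B); the diploid factor of 2 cancels.
    """
    p_B, D, _ = ld_stats(p_AB, p_Ab, p_aB, p_ab)
    return beta_causal * D / (p_B * (1 - p_B))


# Two hypothetical groups with the same allele frequencies but different LD.
group_1 = (0.4, 0.1, 0.1, 0.4)   # strongly correlated variants
group_2 = (0.3, 0.2, 0.2, 0.3)   # weakly correlated variants

print(tag_effect(1.0, *group_1))  # about 0.6 of the causal effect
print(tag_effect(1.0, *group_2))  # about 0.2 of the causal effect
```

A score whose weight at the tag variant was estimated in the first group would overstate the signal in the second, even though the underlying biology is identical in this toy setup.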
01:18:46
Speaker
So that's thought to be the main reason why these scores do not port well. But it's also the case that the environment matters and that even if you
01:19:01
Speaker
all other things being equal, if you're looking at individuals who've say developed type two diabetes in very different environments, then that also matters. And that's also the way in which you as an individual can be different systematically from the training data. So yeah, again, it's complicated.
01:19:27
Speaker
It's not clear how we should deal with this. And one of the things I like about my field is there's a sort of questions about how should we be doing research, what concept should we be using, et cetera. And then there's just very practical challenges. It's like, okay, if you want to design a report to share with an individual about the risk, genetic risk for type two diabetes, how do you
01:19:54
Speaker
think about that, how do you actually do that? So there's all sorts of challenges ahead. Yeah, I just want to make sure I've understood the poly, sorry, the ancestral recombination graph, because it sounds like, as you say, this may not be, it's not the only tool necessarily that we need for these problems, you can just look at similarity across people's genomes and use that as the way of understanding the
01:20:23
Speaker
predicting portability of results. But I guess as you're also saying, well, it seems like the A-R-G, or "arg", is capturing something important, and it specifically might be kind of solving for this problem of like, I don't know, collinearity or something where you have several things, like this is just a classic problem, I guess, in machine learning where you have multiple variables which
01:20:49
Speaker
you don't want two correlated variables going into a log or something. It spews out spurious predictions. And that's a way of controlling for this. And one thing I really liked about your explanation of the ARG when you first gave it is that it's not this kind of flattened thing which loses all the dimensions, as genetic ancestry does if you're just saying, oh, genetic ancestry is
01:21:17
Speaker
you know, European or something. It's just like this block of space, right? It's incorporating, like you said, it's capturing
01:21:28
Speaker
Who was, who was having sex with who? Right, right, right. And like, that depends on, like you said, language, geography, time, obviously as well, like different. Yes, the time piece I think is really important. Like, so, so just to be clear, I, you know, following others, I'm saying genetic ancestry is just this ancestral recombination graph.
01:21:48
Speaker
One of the things that enables us to do is to appreciate that it's a historical concept and not one of like essences. So we can think about, you know, you can think about three generations back and what did that look like? Or you can think about 30 generations back, etc, etc, etc. Any imposition of one set of categories is implicitly interested in just one slice of that kind of historical picture.
01:22:15
Speaker
And yeah, and that's an incredibly rich object. We're never gonna be able to have that object sitting on a hard drive somewhere. It's more that it helps us get clear on what the concept
01:22:28
Speaker
is and is not. And so genetic similarity, okay, I kind of go back and forward about how useful a concept I think this is. But of course, then you've got to define what it means for two people to be similar. In practice, we usually do that using a dimensionality reduction technique.
01:22:48
Speaker
And it's still the case that the bog-standard principal component analysis absolutely dominates the field. So that's just a dimensionality reduction technique. And then you just look at how close in space, in that sort of dimensionally reduced space, two individuals are. But of course, that dimensionality reduction
01:23:11
Speaker
is defined by the data that you put into it. So another problem we've had in the field is like the starting data we had was sampled in such a way to kind of
01:23:24
Speaker
Basically, if you wanted to design a sampling scheme to make humans look like separate groups, you couldn't have done it better. It was kind of, you had to be an individual with all four of your grandparents from the same small village, et cetera, and then they were sort of spread out around the world in these particular ways. And that's what gave us these sort of initial visualizations of genetic similarity. You look at real data.
01:23:50
Speaker
And it looks nothing like that. It looks continuous. So it's just like a matter of experimental design that it seems like, you know, if you look at maybe some of the early data from genomics, it looked like there were very distinct groups, but that was just the way it was sampled. Yeah. And like so often you hardly ever see anybody from like North Africa or the Middle East in those reference data. They just, yeah, it was, it was sort of designed, I think, to,
01:24:18
Speaker
Like I said, you couldn't have designed it better if you wanted to make it look like humans came in these sort of distinct clusters. But anyway, any measure that you have of genetic similarity is just some kind of low-dimensional representation of the ancestral recombination graph.
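A toy sketch of the sampling point, not a reconstruction of any real dataset: individuals sit along one smooth gradient, and the only thing that changes between the two runs is who gets sampled. The allele frequencies, the gradient and the helper names are all assumptions made for illustration, and expected allele counts stand in for real genotypes so the result is deterministic. Real analyses use dedicated PCA tools; the power iteration here just keeps the sketch self-contained.

```python
def first_pc_scores(rows, iters=50):
    """Project each row onto the first principal component (power iteration)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    X = [[r[j] - means[j] for j in range(m)] for r in rows]   # centre each variant
    v = [1.0] * m                                             # starting direction
    for _ in range(iters):
        scores = [sum(x[j] * v[j] for j in range(m)) for x in X]            # X v
        w = [sum(X[i][j] * scores[i] for i in range(n)) for j in range(m)]  # X^T X v
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return [sum(x[j] * v[j] for j in range(m)) for x in X]


def dosages(t, lo=(0.1, 0.2, 0.3, 0.4), hi=(0.9, 0.6, 0.8, 0.5)):
    """Expected allele counts for someone at position t (0 to 1) on the gradient."""
    return [2 * (a + (b - a) * t) for a, b in zip(lo, hi)]


def largest_gap_fraction(scores):
    """Biggest gap between neighbouring individuals, as a share of the full range."""
    s = sorted(scores)
    return max(b - a for a, b in zip(s, s[1:])) / (s[-1] - s[0])


# Sampling only the two ends of the gradient versus sampling evenly along it.
ends_only = [dosages(t) for t in (0.0, 0.05, 0.1, 0.9, 0.95, 1.0)]
evenly = [dosages(i / 10) for i in range(11)]

print(largest_gap_fraction(first_pc_scores(ends_only)))  # large gap: looks like two clusters
print(largest_gap_fraction(first_pc_scores(evenly)))     # small gaps: looks continuous
```

The underlying variation is identical in both runs; the apparent clusters in the first come entirely from leaving out the middle of the gradient.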
01:24:41
Speaker
So, yeah, that is kind of the object that contains all the information. And then you have to make choices depending on your use case about what kind of summary of that makes most sense for what you're trying to achieve. Okay. Yeah. Brilliant. I think I want to, we don't have much time left. Or it depends on the patience of our listeners.
01:25:09
Speaker
Yeah, I want to very quickly go over some of the kind of CRISPR things because I think I'm really interested in your ideas around or particularly why there doesn't seem to be, you call it the non-germline non-controversy.
01:25:33
Speaker
And how there's been a lot of attention with CRISPR, so this gene editing technology whereby we can basically cut and paste bits of DNA. A lot of attention on doing that in heritable portions of our work.
01:25:51
Speaker
or places where the DNA would be inherited from, so doing it in basically eggs, or fertilized eggs I guess, which would make kind of permanent changes in genetic makeup that could be inherited generation after generation.
01:26:09
Speaker
There doesn't, well, two things are striking to me about this. There doesn't seem to be much focus on, or as much attention of, on uses of CRISPR, the ethical implications outside of the germline, suggesting changes that would influence an individual during their lifespan. And also I'm really intrigued why there's not been the same level of concern about changes in
01:26:40
Speaker
bacteria, like microbes and stuff, where people are making edits which are, like, heritable. Like, there's not this kind of, like, germline/non-germline distinction. Yeah. But you just make an edit in a bacteria and if you do it right, it's just going to carry on. Okay, yeah. Uh, so sorry, I've asked you a lot of questions there. Yeah, thank you. I mean, okay, so let's talk about humans first. So that was actually the very first sort of bioethics paper that I wrote.
01:27:09
Speaker
And I had had this observation just watching the dialogue in the field that we tend

CRISPR and Genetic Modification Ethics

01:27:15
Speaker
to think of this sort of two by two matrix of the types of modifications that we can make to human DNA. Yeah. So as you've said, there's germline versus non germline. And germline just means does it have the ability to be passed on to the next generation? So it's, it's, you know, sperm, egg cells and the progenitors.
01:27:40
Speaker
and sort of fertilized embryos and things that are very early stages of embryo development, if they affect what's going to go on to then produce more sperm. Yeah, what's going to go into the next generation, that's right. So we call that germline, we call everything else somatic or non-germline.
01:28:01
Speaker
And so that's one of the dimensions and the other dimension is therapy versus enhancement. So that's a two by two matrix and people were generally saying, well, look, um, the thing that's really new with CRISPR, um, is it gives us this ability to modify the germline, and that poses all sorts of ethical questions about our species, et cetera, et cetera, et cetera.
01:28:31
Speaker
And that was getting absolutely the lion's share of all of the attention. Then there's sort of the therapy and enhancement divide and people were saying, including some of the big sort of reports that came out early on about this that, okay, maybe we could carve out some sort of therapy instances where it would make sense eventually if we sorted out safety and efficacy. Right.
01:29:00
Speaker
to do germline modification for therapeutic purposes.
01:29:03
Speaker
But it's never gonna be okay for enhancement purposes. So for example, I mean, Huntington's disease, again, like, were it possible? Right, like if you were to, and here's the thing, and this is a point that like Hank Greely, who's a bioethicist at Stanford, has made and many, many others, the actual use cases for doing germline modification of embryos are very, very small because you could achieve the same things by screening embryos. Right. So, you know, you would just
01:29:33
Speaker
you know, produce a bunch of embryos through IVF, you would screen and you'd expect to find 50% of them in this case that had, if you had an individual with a Huntington variant, an individual who didn't, you'd expect 50% of their embryos to have the variant and the other 50% not to. So you just choose to implant the ones that didn't. You can think of some edge cases
01:30:02
Speaker
where that strategy might not work, but we're really, really talking about, um, big edge cases. Um, yeah, this, this idea that, and then there's a whole other conversation to be had about the sort of the ethics of screening embryos. That is something that is, um, completely standard today. Um, uh, in the case of the US, completely unregulated; in the case of the UK, quite regulated.
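The 50% figure from the screening example can be turned into a small worked calculation. This is a sketch under assumptions added here: one parent carries a single copy of a dominant variant, the other parent carries none, and each embryo inherits independently. The function name is hypothetical.

```python
def p_at_least_one_unaffected(n_embryos, p_carrier=0.5):
    """Chance that at least one of n embryos lacks the variant.

    p_carrier is the chance a single embryo inherits the variant:
    0.5 when one parent has one copy and the other parent has none.
    """
    return 1 - p_carrier ** n_embryos


for n in (1, 3, 5):
    print(n, p_at_least_one_unaffected(n))

# One possible edge case: a parent with two copies passes the variant to
# every embryo, so screening alone cannot find an embryo without it.
print(p_at_least_one_unaffected(5, p_carrier=1.0))
```

With even a handful of embryos the chance of finding one without the variant is high, which is the sense in which screening covers most of the cases where editing might otherwise be proposed.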
01:30:29
Speaker
Um, so, uh, yeah, so, so, so people say, okay, well, maybe, maybe, maybe there were these edge cases that we could justify.
01:30:39
Speaker
human germline modification for therapeutic purposes, but enhancement is always going to be morally questionable. So enhancement is making something better and therapeutic is fixing it, basically. That's right, but it's a very hard line to draw, it turns out. Yeah, you have some great examples of this, but yeah, maybe, sorry, carry on with your current thought. Okay, so then this other axis you've got is
01:31:07
Speaker
germline and somatic and the sort of, you know, the killer app for CRISPR when it comes to humans is gene therapies. So this is, you've got existing individuals who are suffering from monogenic conditions. Um, and the idea is to just sort it at its source, go in and edit the gene which is not functioning properly,
01:31:34
Speaker
put in a functioning copy and you should completely cure the disease. That is the theory, and there are some approved and there are loads more in the pipeline, and there are still many challenges to be overcome, like, you know, safety, efficacy, how do you get the CRISPR system to the cells that you need to alter? But there are
01:32:03
Speaker
there's kind of like a pipeline at this point of therapies which look promising. So that's that quadrant, the kind of therapy and somatic, that everybody seems pretty comfortable with. There are safety and efficacy concerns, yes, we should really be thinking about those. And then, yeah, but that first article was about this other quadrant, which was really about
01:32:29
Speaker
uh, yeah, so non-germline and non-therapeutic, like that's that sort of, um, somatic enhancement, um, quadrant of that two by two, which had just received basically no attention in the literature and that we wrote about. And for example, one of the issues is, um, uh, you know, many of these genetic changes, the way the gene
01:32:58
Speaker
the genetic variants end up influencing the characteristics or the phenotype of the individual is via influencing development. So if you wait until an individual is an adult, you're not gonna have that effect. So you've got all sorts of issues there about potential genetic modification of children. For example, to increase their chances of being taller.
01:33:28
Speaker
Uh, yeah, another, another example might be other kinds of things that would give people an edge in, um, the sporting world. But I do think that there's, there's sort of things that come up in that, that might well come up in that space. And there's certainly a group of people who have proven themselves willing to sort of experiment on themselves. Um, the so-called biohackers. So yeah. So that's humans.
01:33:59
Speaker
You also brought up non-humans. And that is a very unregulated space. And you were talking about bacteria. And when we talk about bacteria and viruses, there's a whole field that's called gain-of-function research
01:34:20
Speaker
that has achieved a level of notoriety during the COVID pandemic, because something that actually is kind of standard practice in a lot of labs is they'll take these organisms, which have really, really short generation times, and they'll see if they can induce gaining new function. So if it's in the case of bacteria, it might be the ability to metabolize something new, or in the case of a virus,
01:34:49
Speaker
the ability to evade some aspects of like the mouse immune system. And this has been the subject of a lot of debate and such research was paused for some time in the US under the Obama administration, but was restarted. Of course, it's then really tied to
01:35:17
Speaker
all of the debates about how secure are supposedly secure laboratories for doing this type of thing. And yeah, I expect to see it come back in the news. This is not my area of expertise, but it does seem a little bit like playing with fire. Given that there've been so many lab leaks,
01:35:46
Speaker
multiple recorded lab leaks in many different countries over the last few decades. So yeah, I expect that it will
01:35:59
Speaker
be a topic that gets revisited. And of course, the scientists would say, well, look, yes, there is a risk here, but there's huge countervailing public benefit from the type of research we're doing. And so, yeah, that needs to be the consistent focus is like, what is the public benefit? Make sure that's convincing. And it's not just, you know, this is the thing that
01:36:26
Speaker
turned out to be good for my career or whatever. Yeah, we'll do it because we can. Yeah, exactly, exactly. I think recently George Church's lab, which I guess is just down the road from you in Harvard,
01:36:40
Speaker
I think they announced that they've modified E. coli to make it completely virus resistant. And one might say, oh well, I don't want E. coli doing that. But I guess E. coli is really useful in the synthesis of lots and lots of things, lots of chemicals that are really complicated to make chemicals. They're really good at making it. We can modify them. And so yeah, you can see the benefit of that. But it's very hard to do those
01:37:07
Speaker
you know, risk analysis, I guess. I think that's exactly right. Yeah, like it totally makes sense that the right framework to use is a sort of risk-benefit analysis framework. But how do you estimate those risks? I guess there's a lot of parallels with how we think about AI, etc. Yeah. Yeah, it's not clear. Yeah. Um,
01:37:32
Speaker
What do you think, I mean, very briefly, do you think, what is the right kind of ethical framework to look at these kind of things? Like, consequentialism doesn't apply so well. I think there's arguments that in AI, and I think these carry over to these bioethics questions, it's just too hard to predict what the consequences are going to be. So using any kind of consequentialist framework,
01:37:59
Speaker
it's just really hard. And so do you sort of go for a deontological, let's just have a rule about this, or a kind of virtue ethics approach, or do you just try and stick with it, as I think many people are trying to do with AI, and particularly within, I guess, like the kind of effective altruism communities, like certainly consequentialism seems to be the approach used for AI.
01:38:30
Speaker
Yeah. I mean, I don't think there is one good meta-ethical framework here. Bioethics has really been dominated by the use of kind of what you might call mid-level principles. And specifically, the ideas of beneficence (doing good), non-maleficence (not doing harm), justice, and respecting autonomy. I think the principle-based approach is helpful. It's kind of a, you know, it's not
01:39:00
Speaker
uniquely consequentialist, it's not uniquely deontological. And

Bioethics Approaches and Challenges

01:39:07
Speaker
yeah, I think it's important to be not exclusively either of those things. I also think that there's a really big role to play for virtue ethics. I think the way I see a lot of my work going is basically sort of trying to produce useful, what are essentially virtue ethics,
01:39:30
Speaker
frameworks for researchers, because my experience, like, I think the alternative is to really say, you know, in this situation, thou shalt do this, believe me, because we've really thought through the ethics of it, and sort of have a rulebook-type situation, which is not really, which is never going to happen, because you've always got, you know, no one research project is identical to another.
01:40:00
Speaker
So you want to be giving researchers tools to think through some of the ethical issues that might be arising in their work. And yeah, that's kind of, that's a virtue ethics type approach. It's about sort of building work capacities to identify and appropriately react to some of these issues.
01:40:26
Speaker
So yeah, so I guess it's a case of letting many flowers bloom and trying to take ideas that seem useful and that seem to work. I think across all of the kind of applied ethics literature, there's been remarkably little kind of work that really looks at what actually, what types of work in applied ethics make a difference.
01:40:56
Speaker
I think we should be doing more of that. So you mean like what actually gets picked up? Yeah, what gets picked up? What ends up impacting things in the way that one hopes? And yeah, my experience of working with scientists is that almost, you know, they usually really understand like some of the ethical issues or get them very quickly when pointed out and sort of feel motivated.
01:41:26
Speaker
to do better, particularly younger, younger folk, but it's just really hard to know what comes next. So trying to fill in that gap. It's non, again, again, it's complicated, but I do think we need more kind of empirical
01:41:52
Speaker
data collection and reflection on what actually makes a difference. Yeah. Yeah. Cool.

Therapy vs. Enhancement in Genetics

01:42:00
Speaker
One final tangent for my final question. You have this great example about the difficulty of drawing that line between therapeutics and enhancement, and you kind of alluded to that difficulty earlier. And the example I'm thinking of is to do with Hermione Granger. Do you remember this that you wrote on your blog, about her teeth?
01:42:25
Speaker
Oh yeah, oh gosh, which book is it in? Anyway, somebody has grown her teeth. It's Malfoy. Oh yeah, Malfoy has made her two front teeth grow to rabbit sized proportions and the nurse is sort of fixing them and is asking Hermione to basically say stop when they're at their right size and she thinks that she
01:42:52
Speaker
that before she had teeth that were slightly too large. So she allows them to overshoot and go back to what she considers to be a more, you know, a better size. Am I remembering my example? Yeah, you're remembering your example really well. Yeah. So yeah. So should you, does, you know, was that Hermione going for some enhancement there or was that just part of the therapeutic process? I like a,
01:43:21
Speaker
Another practical example is with height. If you are thinking about adjusting, if you've got a kid and there might be, you can imagine two kids who are both predicted to grow to
01:43:47
Speaker
let's say five, five foot. And in one of the cases, it's that prediction is based on the fact that their parents are both very short. And in another case, it's based on the fact that they had this, they actually had some sort of tumor which suppressed some key hormones at some key points, like, um,
01:44:15
Speaker
you could consider giving growth hormone to either kid. Is it therapy for one and enhancement for the other? And if so, why? You really quickly run into this issue of what is natural, what is normal. And that's very, it's like you might then come up with something like, well, it depends on
01:44:40
Speaker
what it would have been in the absence or the presence of like some sort of counterfactual that you can spin up.

Interdisciplinary Thinking in Science

01:44:48
Speaker
But you know, these are models like any others and there aren't any kind of answers given to us by nature. Yeah. Yeah. I love that you're, to me you're using some of the skills that we picked up in the philosophy side of physics and philosophy degree, creating these like very, what might seem
01:45:10
Speaker
sort of recherché thought experiments, but they really capture, you know, they get to the core of the problem. And it's great that you've drawn on Harry Potter for a while.
01:45:29
Speaker
Well this has been really fun but we do need to get to the beach because it's a lovely and unusually sunny day here in Edinburgh. I don't know if there's any final thoughts you want to offer listeners.
01:45:47
Speaker
Firstly, I just want to say it's really interesting to see how your career has gone from the physics into the sciences, into ethical questions of sciences. So I think it's a good example for anyone who's thinking about these, that this is possible and these are important things to think about. Oh, yeah. Well, thank you. Thank you so much for having me. This has been a lot of fun. A final thought. I'm convinced that causation, as it's used in science, is in need of a massive overhaul.
01:46:17
Speaker
And that takes us right back to our physics and philosophy days. So I'm excited to think about that in the future. Wow. What a mic drop. Thanks for having me.

Conclusion and Listener Engagement

01:46:36
Speaker
Thanks so much for listening. You can find the show notes at multiverses.xyz. And don't forget to subscribe, leave a review on Spotify, Apple Podcasts or Overcast, wherever you listen. Cheers for hanging on to the end. Here's till next time.