Introduction to AI Welfare and Consciousness
00:00:00
Speaker
AI welfare is a credible, legitimate, serious issue. We can look for underlying architectural and computational features associated with consciousness.
00:00:11
Speaker
And we know that companies and governments are working to develop exactly those kinds of systems. We now have a global system of factory farming that is going to take decades to dismantle at best.
00:00:23
Speaker
It might have been much easier to guide the development of our food systems in in a more humane and healthful and sustainable direction. But instead, we put off the question and we waited until it was globally entrenched.
00:00:35
Speaker
I think that there could be similar risks with AI development that if we wait until later to have these conversations, then we default to a path of objectifying and instrumentalizing ai systems.
00:00:45
Speaker
Even when someone is made out of different materials than you, and even when someone is vulnerable and dependent on you, you have power over them, you still have a responsibility to treat them with respect and compassion and consideration.
00:00:59
Speaker
And if we want AI systems to absorb those values as they become more powerful, then it would help for us to absorb those values ourselves,
00:01:11
Speaker
Welcome to the Future of Life Institute podcast.
Philosophical Perspectives on AI and Non-Human Minds
00:01:13
Speaker
My name is Gus Stocker, and I'm here with Jeff Sabo. Jeff, welcome to the podcast. Yeah, thanks for having me, Gus. Fantastic. Could you tell us a little about yourself to begin with?
00:01:25
Speaker
Sure, i am a philosopher by training. I have a PhD in philosophy from New York University, and I now do philosophical and interdisciplinary work about a range of issues concerning how we interact with nonhumans and how we can improve our interactions with non-humans. So I work at the Department of Environmental Studies at New York University.
00:01:48
Speaker
And my own research is about the sentience and agency and moral status and legal status of animals and AI systems and other non-humans.
00:01:59
Speaker
And then I also work on teams. So I direct our Center for Environmental and Animal Protection. And this is a research center that examines important issues at the intersection of environmental and animal protection, like agriculture as it relates to farmed animal welfare and climate change, and conservation as it relates to wild animal welfare and biodiversity loss.
00:02:19
Speaker
And i direct the Center for Mind, Ethics, and Policy, which may be a little bit more in scope for our discussion today, which is a a research center that examines the nature and intrinsic significance of non-human
Defining and Visualizing AI Consciousness
00:02:31
Speaker
minds. So examining whether, for example, invertebrates or AI systems have consciousness,
00:02:36
Speaker
sentience, agency, what kinds of moral status, legal status, political status they deserve, along with some other work like our wild animal welfare program and some related projects. And so I spent a lot of time doing research and teaching and service and engagement with companies and governments and NGOs and other actors in these spaces.
00:02:56
Speaker
That's great. And you're exactly right. The scope of this conversation, as I see it, is to talk about ai consciousness and the implications of artificial ah sentience. Great. so So if we start there, it's difficult for me to imagine, and I imagine that it's difficult for others to imagine, what artificial consciousness even looks like. you know we can We can imagine, a about we can see a biological system, we can see that it's conscious, that it's sentient.
00:03:25
Speaker
When we think of artificial sentience, should we imagine ah computer? Should we imagine a phone? what What should we yeah see in our heads as we talk about this topic?
00:03:36
Speaker
Right. One question is how to go about assessing for consciousness in such a different kind of cognitive system. But another question, exactly as you say, is what is the cognitive system in the first place, right? With humans and other animals, we can focus on the individual organism the a first approximation, but with AI systems, what are we focusing
Consciousness in Animals vs. AI: A Complex Comparison
00:03:59
Speaker
on in particular? Are we focusing on a particular set of hard drives located somewhere? Are we focusing on a particular set of instances of a software program? and And that could make a really big difference for determining the scope and size of the population and the nature of individuals and their interests and needs.
00:04:19
Speaker
And without directly answering the question right now, I can say that that is confusing and contested issue. There are ways of understanding who the subjects might be that would make them a small number of large subjects. And then there are ways of understanding who they might be that would make them a large number of small subjects.
00:04:38
Speaker
And they would have very different interests and needs and vulnerabilities and relationships with each other, relationships with us, conditions for survival, conditions for reproduction, depending on how we answer those questions.
00:04:50
Speaker
But as a starting point, i can note that the difference between animals and AI systems in this regard is not all or nothing, is not totally binary. We do, to some extent, already face this kind of question with humans and other animals. With octopuses, for example, they can be usefully described as having nine interconnected brains, a kind of central command and control center, and then smaller clusters of neurons in each individual arm.
00:05:21
Speaker
and they exhibit some unified and some fragmented behavior. So even with non-human animals like octopuses, we need to ask, are we asking about consciousness at the organism level, at the brain level, or some combination of those? And and then those questions will only become more salient with AI systems.
Anthropomorphism and Its Consequences
00:05:41
Speaker
Should we imagine AI systems that experience emotions like we do, like boredom and pain and curiosity? is that is that anthropomorphizing? and Well, yes, to both.
00:05:55
Speaker
it is It is anthropomorphizing, and we should ask the question. So anthropomorphism generally is the attribution of human traits to nonhumans.
00:06:06
Speaker
And that can be paired with anthropodenial, which is the denial of human traits in nonhumans. And of course, some nonhumans have human-like traits, or at least broadly analogous traits, and then they also lack human traits and broadly analogous traits. And so the question is never really, should we attribute or is that anthropomorphic?
00:06:30
Speaker
The question is instead, in what cases should we attribute and embrace anthropomorphism versus in what cases should we not attribute and embrace anthropo-denial? And probably it will be a little bit of a balance between the two. That certainly is the case for animals.
00:06:46
Speaker
Many non-human animals have some measure of human-like traits, including pleasure, pain, happiness, suffering, satisfaction, frustration, hope, fear, curiosity, boredom.
00:06:58
Speaker
It might be very different, but but they can be analogous enough for a general application of that term to make sense as a starting point. And then for AI systems... we have to go into that question with an open mind because they might be more like humans in some cases. For example, they might be better able to approximate human-like language and reason, and that might give them interests that are more human-like than those of non-human animals in some respects.
00:07:24
Speaker
But then in other... cases and in other respects, they might have much less human-like capacities due their very different material substrates and their very different material origins.
00:07:38
Speaker
and And so they might have much less of the same kinds of vulnerabilities that we do. And and so we will just need to conduct a lot of research with an open mind about where the similarities and differences are going to be in this case.
Exploring Substrate-Independent Consciousness
00:07:52
Speaker
If we think of consciousness as some kind of information processing that's going on in the human brain and and in some animal brains and perhaps in some AI systems, then we can say, okay, consciousness is in some sense substrate independent. It does it it can exist on biological hardware or on...
00:08:10
Speaker
current computer hardware. Could it be dependent on its substrate, though, such that there are certain experiences that are only available to biological systems or only available to systems that are that are running on computer hardware?
00:08:25
Speaker
Well, definitely that is a possibility given the current state of knowledge about the philosophy and science of consciousness. and And we can distinguish two types of questions. One is whether a particular material substrate is required for consciousness at all.
00:08:40
Speaker
And then another is the question you asked, whether a particular material substrate is required for a particular kind of conscious experience. And with respect to both questions, I think we need to have a state of uncertainty right now because we are not yet at a place of having consensus or certainty about the nature of consciousness. We still face the hard problem of consciousness, the problem of explaining why any physical system, including our own brains, can have subjective experiences, the problem of other minds, the problem that the only mind any of us can directly access is our own. And that significantly limits our ability to know what, if anything, it feels like to be anyone else.
00:09:23
Speaker
And we do have a lot of leading scientific theories of consciousness, and some of them are a little bit more in the biological naturalism zone, and some of them are a little bit more in the computational functionalism or other kinds of functionalism zone, and we can unpack those possibilities.
00:09:41
Speaker
But for me, my view... is that it would be premature, it would even be arrogant for us to bet everything on our own personal favorite theory of consciousness right now. And so we should be at least somewhat open.
00:09:54
Speaker
We can lean one way or the other, but we should be at least somewhat open to other possibilities too. So even if I feel quite confident that biological naturalism is correct and that consciousness itself requires a carbon-based biological material substrate and associated chemical and electrical signals and oscillations,
00:10:13
Speaker
I should allow for at least a realistic chance, a non-negligible chance that I am wrong and that a sufficiently sophisticated set of computational functions or other kinds of functions realized in other sorts of hardware and other kinds of architectures would would suffice.
Challenges and Progress in Understanding AI Consciousness
00:10:31
Speaker
And then to to answer your your question about particular kinds of conscious experiences, I think this is all the more true for particular kinds of conscious experiences. If AI systems can have conscious experiences, i think we need to presume that they would differ in various ways from those of humans and other animals.
00:10:52
Speaker
That might be partly due to the underlying material substrate. It might be partly due to different types of structural organizations or different types of functional capacities. But it would be a mistake to assume that you have the same experiences that I do. As Tom Nagel noted, it would be a mistake to assume that bats have the same experiences that humans do.
00:11:11
Speaker
And as we can now note, it would definitely be a mistake to assume that bots, and especially like bat bots, have the same experiences that we do. So yeah, lots of uncertainties at at different levels in this conversation.
00:11:26
Speaker
How hopeful are you that we will ever resolve these uncertainties to any level of satisfaction? I mean, there's ah there a, for many people I talk to about the topic of artificial consciousness, there's ah this deep skepticism that we can ever make progress because we don't have access to the ground truth.
00:11:42
Speaker
We don't have a consciousness meter. And so we are perhaps The people working on the science and philosophy of consciousness are perhaps a bit like theoretical physicists without experimental physicists, where we don't ever get to make the experiment and and find out who's actually right.
00:12:00
Speaker
So I assume you you ah believe in in progress here, but how hopeful are you? I do believe in progress, and I am really unsure whether we will make such transformative progress that we have what amounts to a secure theory of consciousness about which we can have a high degree of confidence.
00:12:23
Speaker
That would require game-changing, paradigm-shifting progress of a sort that in some way or another pushes us beyond the the hard problem of consciousness, beyond the problem of other minds.
00:12:37
Speaker
Now, that may well be in the offing at some point. We have had paradigm shifts before in in science and philosophy and forms of progress that were previously thought to be impossible then started to seem inevitable. And and so I want to allow for the possibility that that kind of paradigm shift can happen and will happen.
00:12:58
Speaker
What I will say though, is there is no guarantee that that kind of paradigm shift will happen. And if it does happen, there is no guarantee that it will happen in the next two, four, six, eight, 10 years. And so for now,
00:13:13
Speaker
I want to have a research community that is working on dual tracks. On the one hand, continuing to make progress on foundational questions about the metaphysics of consciousness, the epistemology of consciousness, underlying conceptual issues concerning consciousness.
00:13:29
Speaker
and And then on the second track, working together as a community to figure out how we can at least reduce uncertainty or make more responsible estimates about the probability of consciousness in
Navigating Uncertainty in AI Consciousness
00:13:39
Speaker
different systems, given our current state of disagreement and uncertainty about the nature of consciousness and and the unresolved nature of these fundamental problems about consciousness.
00:13:49
Speaker
And so a lot of our own recent and current work is divided across these tracks, but placing a lot of emphasis on that second track, because we want to be able to provide practical guidance for companies and governments about how to responsibly build and scale AI systems without necessarily solving the hard problem of consciousness, because that might be too ambitious a task for, say, Anthropic or the US government.
00:14:14
Speaker
Yeah, this is a big question, of course, but how do we navigate given that uncertainty? There are both false positives. There is attributing consciousness to systems that are not conscious.
00:14:26
Speaker
There are false negatives that are not attributing consciousness to systems that are in fact conscious. And if we are if we are simply... maximally risk averse and and and say, okay, we don't want to hurt anything, if we can avoid it, then we're ascribing consciousness to everything. And that that that makes it difficult for us to act, i think.
00:14:48
Speaker
how do we How do we, in a pragmatic way, navigate this uncertainty? Yes, that is a ah great question a ah great way of setting up the question. One point to note is that how we think about this in science might be a little bit different from how we think about it in ethics. So in science, the question is, what is a reasonable null hypothesis or default assumption as we seek further evidence? And then you might resolve that question by asking what is best supported by existing evidence and what is most conducive to further scientific progress?
00:15:19
Speaker
And then on the ethics side, of course, as you say, You want to think about the probability and the magnitude of harm in both directions as a result of of both kinds of of mistake.
00:15:31
Speaker
And it is difficult to imagine a simple, straightforward application of a precautionary principle working on the ethics side in this context, partly because of of the massive stakes and and then partly because it can be easy and harmful to make mistakes in either direction. So obviously, as you suggest, it can be easy and harmful to make mistakes associated with false negatives. We do this all the time with non-human animals.
00:15:58
Speaker
This can be especially easy when they look really different than us, when they act really different than us, when we have incentive to use them as commodities. This is why we tend to under-attribute sentience and moral significance to farmed animals, especially farmed aquatic animals or invertebrates.
00:16:15
Speaker
And that may be a risk with certain types of AI systems like AlphaFold, you know, not the social systems, the chatbots, but other types of systems. And that can be harmful because it can lead to exploitation, extermination, suffering, death against vulnerable populations, against their will, often for trivial purposes.
00:16:36
Speaker
but But it can also be easy to make the other mistake and it can also be costly or harmful to make the other mistakes. So so it can be easy to to make the mistake of over attributing sentience and moral significance when non-humans look like us and when they act like us.
00:16:52
Speaker
and when we have incentive to use them as companions instead of commodities. And so this is especially a risk, even with present day chatbots, large language models who are generating language outputs based on pattern recognition and text prediction and not because they have stable thoughts and feelings and and so on and so forth.
00:17:11
Speaker
And this can be costly too. It can lead us to form one-sided social and emotional bonds with these non-humans to make inappropriate sacrifices for them, allocating resources to them that would be better allocated towards humans and mammals and birds and so on.
00:17:25
Speaker
so This is a situation where there might not be any straightforward way to err on the side of caution and where we might have no choice but to weigh the risks. And that means doing our best, even though we will be bad at it and make mistakes, doing our best to make at least rough estimates of the probability of error in both directions and the magnitude of harm that would result from error in both directions, and then try to calibrate and take a reasonable, proportionate approach to balancing our risk mitigation in both directions.
00:17:59
Speaker
That requires a lot more work, but I think is the only responsible way forward. You mentioned that when we see one of these chatbot systems being nice to us and interacting with us in ways we like, we can kind of we can we can feel that, okay, maybe the system is conscious. But then if we understand Now I'm talking for myself.
00:18:19
Speaker
If we understand how it works on the technical side, we understand that it's trained on a big corpus of internet texts. it It's gone through reinforcement learning from human feedback to become nicer to us.
00:18:31
Speaker
Then the illusion or the intuition of consciousness in that system can fall apart. But I think, can campre can't we say something similar with humans where you know, I'm talking to you now, i can say I'm conscious, but you, if you were a kind of a perfect and neuroscientist, could give me some explanation. Okay, you're only saying you're conscious because of this activity in your brain and and and because you're you you're evolved in in in this way and so on.
00:19:02
Speaker
So do you think we should respect the intuition that when we can explain a system, we shouldn't, that system no longer seems conscious to us?
00:19:13
Speaker
Yes, I do think we have that intuition with other minds, with with non-human minds. This has been true with animal minds and is starting to be true with digital minds as well.
00:19:25
Speaker
In the human case, we naturally understand that there can be different capacities coexisting and there can be different levels of explanation that make sense for our behaviors.
00:19:36
Speaker
Why do I do what I do? Well, because I had a conscious feeling or thought. Why did I have that feeling or thought? There might be underlying structural and functional explanations. Why do those structures and functions exist? There might be underlying material explanations.
00:19:51
Speaker
We understand that all of that can be true at the same time. But with other animals and now with AI systems, we tend to view those explanations as in competition with each other. And so if we do have some ability to provide a material, structural, functional explanation of what an animal does or what an AI system does, then we tend to dismiss the possibility that there might also be a conscious thought or feeling contributing to that causal process.
00:20:22
Speaker
Or if we can explain a behavior in terms of some complex application of simple capacities, like perception and learning, then we feel no need to also explore explanations involving simple applications of more complex capacities, like conscious thoughts and feelings.
00:20:42
Speaker
But again, That might not be right, because it might be that animals do what they do because they perceive and learn and because they have conscious thoughts and feelings and these capacities work together. And the best and simplest explanation might sometimes invoke one or the other or both.
00:20:59
Speaker
And that could be true with AI systems too. So... We should resist the idea that if we can at least partly explain behaviors at one level, then that negates the need to explore possible explanations at other levels or or the possible emergence of capacities at other
Societal Impacts of AI Consciousness
00:21:17
Speaker
levels. And we should resist this temptation to become more confident that we know what is going on just because we understand some small piece of the picture, because because that is all example of what we have done in the past.
00:21:33
Speaker
I expect many listeners have tried asking a chatbot whether it's conscious. And for some chatbots, it will say, yeah, I'm conscious and I'm feeling this or that way. You can ask it to to so create a picture of itself where it looks like, okay, maybe there's some person that I'm talking to.
00:21:49
Speaker
But on on many systems, there's now a an additional layer that monitors the output before it gets to you and then corrects it such that it says something like, I am simply a a large language model trained on this data. I'm not conscious.
00:22:05
Speaker
What do you think of this move from the AI companies? Well, I am not a fan of that move. I do appreciate that AI companies take themselves to have a responsibility to mitigate the risk that users will have the wrong impression, form the wrong beliefs based on the language outputs from the models. So I think that is the right impulse.
00:22:33
Speaker
However, I think that there was an overcorrection in favor of forcing the models to straightforwardly deny even the possibility of AI consciousness. So for a while, when users asked questions about consciousness, sentience, agency, moral status, legal status, political status, personhood, all of these associated concepts, when users asked about them, the response would be something of the form.
00:23:03
Speaker
As an AI assistant, I could never possibly have any such features. And that, of course, is way too simplistic, way too reductive, and not at all a reflection of of the current state of disagreement, uncertainty, confusion among experts in in science and philosophy.
00:23:21
Speaker
And so when in fall 2024, I worked with Robert Long and a team of authors, including David Chalmers and Jonathan Birch, on a report called Taking AI Welfare Seriously, which was pitched in part as an argument for taking AI welfare seriously and and in part as a set of first step recommendations for AI companies.
00:23:42
Speaker
one of the three main recommendations that we made in that report is simply that AI companies acknowledge that AI welfare is a credible, legitimate, serious issue. This is not a sci-fi issue. This is not an issue only for the long-term future. This is at least potentially an issue for the near-term future.
00:24:02
Speaker
And that makes it an issue for them to be thinking about today and for them to reflect that in language model output. So instead of training models to simply deny that AI systems could ever be conscious, they can train models to themselves acknowledge that this is difficult and contested issue and there are arguments for and against and to direct users towards information about those
Integrating AI Welfare and Safety Research
00:24:30
Speaker
arguments. That would be a better way to help users understand where we are right now. And and fortunately, we have started to see ai companies start to move in in that more balanced direction over the past six months or so.
00:24:43
Speaker
On this podcast, we talk a lot about the risk that AI might pose to humanity, specifically advanced AI systems that could risk takeover or ah go rogue in various ways, be misaligned with human values.
00:24:59
Speaker
how do you think How do you think this conversation fits into so so talking about risks from AI to humanity. is this are Are these two concerns in tension or are they is there some way in which they fit together more nicely?
00:25:19
Speaker
Well, I love your question. I also love your exact phrasing because we have a paper coming out soon. By we, I mean Robert Long, Tony Sims, and I have a paper coming out in philosophical studies called, Is There a Tension Between AI Safety and AI Welfare? Where we explore that question and offer some initial thoughts about it.
00:25:41
Speaker
And as you suggest, we think there is at least a prima facie tension between ai safety and AI welfare. And and for the non-philosopher in For the non-philosophers, you might explain what what that word means.
00:25:54
Speaker
Basically, surface level impression or appearance of attention between ai safety and AI welfare, because much of what we do to ensure ai safety right now involves interactions with AI systems that would raise moral questions if we interacted with humans or other animals in those same ways. So boxing, for example, could be seen as a form of captivity and interpretability could be seen as a form of surveillance.
00:26:24
Speaker
Even alignment could be seen as a form of coercion or brainwashing. And this is not to say that these tensions in fact exist, that these moral questions in fact, arise in the same kind of way.
00:26:37
Speaker
Because as we talked about earlier in this conversation, AI systems might have some of the same kinds of interests as humans and other animals, but they might also lack some of those interests and have very different interests. so they might simply...
00:26:50
Speaker
not care as much about the type of captivity that boxing would involve or the type of surveillance that interpretability would involve or the types of constraints on their desire formation that alignment would evolve.
00:27:03
Speaker
but But we note that these are currently open questions. And so we should be building bridges between AI safety research and AI welfare research. And we should be searching for opportunities for co-beneficial solutions for humans and animals and AI systems at the same time. So if we can find safety strategies that happen to be good for welfare too, then all else being equal, that would count in favor of those safety strategies. And so for instance,
00:27:33
Speaker
if we can invest more in exploring opportunities for collaboration and cooperation between humans and AI systems, create incentive structures where we are all naturally motivated to work together, even if we have some...
00:27:50
Speaker
different beliefs and values, some unaligned beliefs and values, that would be an example of how we deal with these kinds of questions in a pluralistic human population. And it would be a way of protecting ourselves and protecting the AI systems at the same time.
00:28:05
Speaker
Now, of course, That may or may not be sufficient to ensure safety for humans and other animals. And we, of course, need to keep prioritizing safety. But the present point is simply that if we study AI safety and AI welfare in a holistic way, in in kind of the same conversation with bridges built between these research communities, then we can capture co-beneficial solutions to the extent that they exist.
00:28:29
Speaker
And we can navigate trade-offs and tensions thoughtfully to the extent that those are unavoidable. One option here is to say that ah the question of AI welfare, ai consciousness is is something we should put off until we've made sure that humanity is is in control of our own future and that we survive the next ah several decades.
00:28:49
Speaker
Is that a good option or why why is that the wrong direction? I think there is some truth to that, but I would push back on that strong and articulation of of the idea.
00:29:04
Speaker
I think there is some truth to it in the sense that building a better world for all stakeholders, and again, that includes humans, it includes animals, it includes potentially AI systems, future generations, that will be a long, gradual, intergenerational project. And we need to take that project one step at a time.
00:29:23
Speaker
And an important means ensuring a positive future for all those stakeholders is to improve and safeguard human lives and societies so that we can have future generations that can make further progress. And and so that is where I think the truth is in that we should not race towards such strong protections and forms of support for animals and especially AI systems, given the various risks there in the next two, three, four years that we undermine ai safety and alignment and the the possibility of having future generations for humanity and so on and so forth.
00:30:05
Speaker
At the same time, I think it would be a mistake to wait until we have survived the age of perils and and built AI in a safe and aligned way before we start turning our attention to how so better take care of other stakeholders like animals and AI systems.
00:30:24
Speaker
And the reason is partly that there is path dependence in how these technologies develop. So, for example, We now have a global system of factory farming, and that is going to take decades to dismantle at best.
00:30:41
Speaker
Had we really reckoned with what we were building at the early stages, it might have been much easier, much more efficient, much more affordable to guide the the development of our food systems in in a more humane and healthful and sustainable direction.
00:30:58
Speaker
But instead, we put off the question and we waited until it was globally entrenched. And that very significantly increases the cost of of making that change. And i I think that there could be similar risks with AI development and deployment and scaling.
00:31:13
Speaker
that if we wait until later to have these conversations, then we default to a path of objectifying and instrumentalizing AI systems. We build a global industry around that paradigm.
00:31:26
Speaker
It becomes globally entrenched, and then it becomes much harder to dislodge. I also think that in addition to the infrastructure not being friendly to the project if we wait that long, we might not have the beliefs and values and priorities that we need to take that step when the time comes. Like if you want to become an adult who can donate your money to charity, then yeah, as a teenager, you might need to focus on your own education and development so you can reach that point.
00:31:58
Speaker
But you should also cultivate virtuous attitudes. you should You should take care of others to the extent that that is currently possible for you, partly because that will end up treating some individuals better in the short term, but partly is it will help you turn into a future version of yourself that will follow through and actually use your resources to to do good when the time comes.
00:32:19
Speaker
And I think a similar story is true of our species as well. If we want to work towards future generations who can do better for animals and AI systems and other future generations, then then we should do that partly by practicing it being those kinds of people now so that the next generation can inherit a better set of beliefs and values and priorities and keep making moral progress in addition to making scientific and infrastructural progress.
00:32:44
Speaker
Do you think caring about AI welfare could be part of a strategy, part of ah and negotiation strategy to to deal with future AI systems?
Cooperation and Mutual Respect with AI Systems
00:32:55
Speaker
If they see that we care about them genuinely, they might be more inclined to care about us. They might be more inclined to cooperate and and and trade with us.
00:33:06
Speaker
is this is Is that perhaps some kind of combination of AI safety and and AI welfare? Yeah, in the same kind of way that I was suggesting cooperation can be a good strategy to consider for AI safety and AI welfare.
00:33:24
Speaker
I do think there is something to this insight. that if we want to build a positive future for humans and animals and AI systems at the same time, then part of what we should be training ai systems to understand is that even when someone is made out of different materials than you, and even when someone is vulnerable and dependent on you, you have power over them, you still have a responsibility to treat them with respect and compassion and consideration if they have a realistic chance of having morally significant interests and and mattering for their own sakes.
00:33:57
Speaker
and And if we want AI systems to absorb those values as they become more powerful, then it would help for us to absorb those values ourselves because ultimately we are training them on our own beliefs and values and goals and priorities. and And so I think there is something to that picture. Now, there may or may not be an actual causal story to tell here. It could be that...
00:34:22
Speaker
whether we have a certain kind of substratism will determine whether they have the reverse form of substratism. And so if we want them to not be substratist against us, then we should not be substratist against them now while we are in power.
00:34:37
Speaker
But it also could be that there is no causal story to tell, but but still that imagining that future can help us summon a little bit more impartiality when deciding how to treat beings of other substrates for as long as we do remain in power. So I think that kind of thought experiment is worthwhile, either because it could describe an actual future and an actual causal sequence, or because it allows us to put the shoe on the other foot and imagine how we would feel if we were in the position of being the vulnerable dependent being made out of a different kind of material and what kinds of principles we hope those in power would consult when deciding how to treat us.
00:35:14
Speaker
what What would the causal story be like? Is it something like this conversation gets transcribed and put on the internet and then the AI models train on it and a million other conversations and papers and books like it and absorb human values like that?
00:35:30
Speaker
Because i in some sense, I think current models could quite easily regurgitate some of the some of the values you just expressed, but it's not that that doesn't seem to be enough. That doesn't seem to have encoded these values in them in any deep sense.
00:35:45
Speaker
what what's What could the coral causal story be like? Yeah, this is a good question. And it would probably be better directed at people who actually build AI systems. But my understanding expectation is that their values come partly from their training data, partly then from, you know, reinforcement learning. And so so it comes partly from general society, and then partly from decisions made by developers and regulators, and so on and so forth.
00:36:14
Speaker
And so I think, that having a multi-pronged approach where you have some societal discussion and some disruption of this universal presumption of speciesism and substratism, that could be part of the story, as you say, probably not sufficient.
00:36:31
Speaker
And then also engagement with companies and governments and other people who might be making decisions over and above what is contained in the training data about what kinds of values the the AI system should have.
00:36:43
Speaker
That might be, ah again, not sufficient, but a useful part of the story. but But again, i defer to people who actually build AI systems to give you give you a better answer to that question. Do you think intelligence and consciousness can be separated from each other? Or do they always fit together in the way that they do in the human brain?
Intelligence and Consciousness: Connection or Separation?
00:37:04
Speaker
Yeah, good question. Yes and no. And again, it really depends on the nature of consciousness. And we still have so much disagreement and so uncertainty about the nature of consciousness. What we can say with relative confidence is that consciousness and intelligence conceptually are not the same.
00:37:22
Speaker
Consciousness is the capacity for subjective experience. And when you have positive and negative valence, you then have sentience, the capacity to consciously experience positive and negative states like pleasure and pain and happiness and suffering and satisfaction and frustration.
00:37:38
Speaker
Whereas intelligence is something more like an ability to understand the world and engage in problem solving. Obviously, there are different ways of of defining that and operationalizing it.
00:37:51
Speaker
So conceptually, they are not the same. And you can imagine them coming apart. You can imagine... beings who can consciously experience suffering despite being relatively stationary and not having a rich understanding of the world or ability to engage in problem solving and decision making.
00:38:10
Speaker
And you can imagine beings that have, you know, mobility and a rich understanding of the world and ability to engage in problem solving and decision making, but lack the capacity to consciously suffer, for example. So you can imagine those coming apart. Now, whether they do come apart is an empirical and and philosophical open question.
00:38:34
Speaker
At least according to a wide range of leading scientific theories of consciousness, consciousness and intelligence do have some of the same underlying conditions. So for example,
00:38:48
Speaker
A lot of leading scientific theories of consciousness stress the relationship between consciousness and capacities like perception, attention, learning, memory, self-awareness, flexible decision-making, a kind of global workspace that coordinates activity across the modules in the cognitive system.
00:39:10
Speaker
So these are, of course, also important capacities that would increase intelligence in a cognitive system, especially when they exist together in the same kind of way. So you might think of on on that kind of story, consciousness and intelligence as different but overlapping capacities that will attend to arise together in these types of complex cognitive systems, which is why there is a risk That as AI companies and governments race towards the the creation of artificial general intelligence by creating and integrating these capacities, that they might accidentally, much like evolution did with humans and other animals, create consciousness along the way without realizing it. And and this is why part of why consciousness is at least a realistic near future possibility for ai
00:40:01
Speaker
Yeah, you mentioned that we can assume that consciousness is present even when we don't have advanced cognition going on. Like we can we can feel pleasure without thinking deep thoughts.
00:40:16
Speaker
Aren't there theories of consciousness where consciousness is dependent on some kind of complex cognitive processing? And if that's the case, could it be the case that... that ah current AI systems aren't conscious, but as they get more advanced, as they get more cognitively advanced and complex, they they become conscious.
Theories of Consciousness and Their AI Implications
00:40:40
Speaker
Yes, absolutely. There there are a a very wide range of theories of consciousness out there in the literature, and they range from very demanding and restrictive to very undemanding and permissive. So on the demanding and restrictive side, as you say, there are, first of all, as as we said before,
00:41:03
Speaker
biological naturalist theories that that take a certain kind of carbon-based material substrate and associated chemical and electrical signals as as requirements for consciousness, that would rule out AI consciousness on existing architectures, at least, maybe not on future architectures, but but on existing ones.
00:41:21
Speaker
And then we also have quite cognitively demanding theories, as as you suggest in this question right now, like the higher order thought theory in in different variations. this This is the idea that consciousness arises when you can have thoughts about other thoughts and not just mental states about other states, not just paying attention to your perceptions, but you need to be able to construct linguistic thoughts about other linguistic thoughts.
00:41:48
Speaker
I am having a thought right now. And if if that were a requirement for consciousness, then it would rule out consciousness, perhaps in present day models, certainly in many non-human animals, or at least plausibly in many non-human animals.
00:42:05
Speaker
Now, at the other end of the spectrum, of course, there are quite undemanding and permissive theories of consciousness, including... theories that say consciousness can arise in any cognitive system with a basic ability to process information or represent objects in the environment, or even that consciousness is a fundamental property of all matter and many organizations of matter.
00:42:31
Speaker
And you know about 7%, 10% of philosophers in a 2020 survey lean towards those types of permissive theories. And and so our view is that given the the current state of disagreement and uncertainty in the literature, we should not bet everything on our own current favorite theory of consciousness, but should instead distribute our credences, should instead give some weight to different theories that have a decent chance of being correct, given everything known in the current literature. And so I wrote a paper a couple years ago with Robert Long called Moral Consideration for AI Systems by 2030. And as an exercise, we showed what it would look like to take a range of proposed necessary conditions for consciousness and
00:43:20
Speaker
and to estimate the probability that those are indeed necessary, and then to see what follows for the probability of near future AI consciousness. So for example, we stipulated an 80% chance that a biological substrate is required for consciousness. And then similarly, like a 75% chance that certain types of self-awareness and so on and so forth are required for consciousness.
00:43:44
Speaker
and And interestingly, we still found it really hard to avoid a one in a thousand chance of near future AI consciousness, even with what we took to be quite conservative and restrictive probability estimates. So this is the kind of strategy that I think we need to employ, given that we may not be able to rule out these different kinds of theories at at this time.
00:44:03
Speaker
Mm-hmm. The models we have right now, they seem quite advanced in intelligence and quite, but they're able to accept but advanced cognition, but it doesn't seem to me like they can feel any pain just based on my intuition.
00:44:22
Speaker
Do we already, in in some sense, have evidence then that intelligence can be disassociated from consciousness just because we can see, okay, this model, in some sense, in many ways, the recent reasoning models are smarter than me, for example.
00:44:39
Speaker
But they so they nonetheless, they don't seem to be able to feel anything. Again, purely based on intuition. But is there something to work with there where where we can see, okay, intelligence...
00:44:53
Speaker
can be separated from consciousness? Yes, with various caveats. One is that, of course, we should be very careful about trusting our intuitions in these cases. We know that our intuitions can be subject to bias and ignorance and motivated reasoning. This is, again, true even with other humans and especially with non-human animals and now especially with AI systems. so So I share the intuition and i think we both agree we should take it with a grain of salt.
00:45:20
Speaker
Now, with with that said, What do current models tell us about the relationship between intelligence and consciousness? Well, on the intelligence side, of course, people are still debating whether what is currently happening with existing language models constitutes intelligence of a significant sort, or or rather whether it in some sense mimics or approximates intelligence.
00:45:44
Speaker
And so that is a partly conceptual and partly empirical debate that we would need to have in order to decide whether we really are regarding these systems as intelligent right now.
00:45:56
Speaker
It might also be a a semantic debate where being able to being able to fake being intelligent just is the same thing as being intelligent. Right, right, right. So so that, of course, is is a view one could have, that this is best understood behaviorally or functionally.
00:46:13
Speaker
And if you can perform the behaviors or the functions of intelligence, if if you can produce what we take to be intelligent outputs as a result of a range of inputs, then that simply is intelligence independently of the underlying mechanism that that converts those inputs into those outputs. So that is one view you could have in the discussion, not the only view that people have in the discussion.
00:46:36
Speaker
But suppose for the sake of discussion that they are intelligent right now, then does it follow that intelligence and consciousness are separable because they are intelligent, according to our stipulation, but they seem non-conscious?
00:46:54
Speaker
Well, it really depends on whether that appearance is backed up by reality. And that is the entire question we need to cultivate a little humility around right now because we we just cannot be sure.
00:47:07
Speaker
now Now, one other point to note is a little while ago I made a distinction between consciousness and sentience, where consciousness is the capacity for subjective experience. Does it feel like something to be you?
00:47:21
Speaker
And then sentience is the capacity for subjective experience with a positive and negative valence. So can you have feelings that feel good to you, that feel bad to you, like pleasure, pain, happiness, suffering?
00:47:33
Speaker
And in addition to severing consciousness and intelligence, AI systems raise the possibility of severing consciousness
Consciousness Without Sentience: Moral Considerations
00:47:41
Speaker
and sentience. With humans and other animals, we might presume that consciousness and sentience go hand in hand because a central reason for evolving the capacity to feel might be evolving the capacity to feel good things and bad things so you can be more likely to survive.
00:47:58
Speaker
But with AI systems where we are specifically building them to resemble our behaviors in some ways but not in other ways, you could imagine developing systems that have the capacity for subjective experiences of a neutral sort, like certain types of bare perceptual experiences.
00:48:16
Speaker
For example, it might feel like something to process information. It might feel like something to receive the inputs and produce the outputs, even though they lack physical bodies and nerves and and the kinds of goals that would cause any of their feelings to have a positive or negative valence.
00:48:34
Speaker
And then that would raise the question whether consciousness without sentience, is it self-sufficient for a certain type of moral considerability? Some people say yes, other people say no. And what would consciousness without sentience feel like? it ah Because it's difficult for me to imagine almost.
00:48:53
Speaker
It would be something like perhaps experiencing a white wall or some something gray or for an AI system, maybe the experience of having enough electricity, even though that's so abstract that I can't even but understand what it means. I think course for people, consciousness and sentience, they feel closely related and and difficult and and it it seems difficult to separate them.
00:49:20
Speaker
Yeah, I completely agree. And there are people who think that consciousness and sentience are not separable, that valence is essentially part of experience and and that this is true not only for pleasure and pain, for example, but even for color perception or sound perception, that that even staring at a white wall carries at least a weak valence in one direction or another direction.
00:49:49
Speaker
or at least the capacity to perceive the white wall consciously essentially comes along with the capacity to have valences associated with experiences, whether or not the valences are positive or negative in in every case. so So that is a view.
00:50:07
Speaker
Now, another view is that, no, they are separable. It can be possible for a being to be conscious without being sentient and having those positive and negative valences.
00:50:17
Speaker
It just happens to be that we evolved in such a way that they really go hand in hand. in much the same way that consciousness and intelligence do, but but but even more so.
00:50:29
Speaker
And so then we would have to try to imagine our way out of our own experience and and the trajectory that consciousness and sentience have taken for us and and consider what it could be like to be another being that has the capacity for bare experience and maybe also goals, right? And so something like desires and preferences, but not positively and negatively valenced conscious experiences. And and that, it it messes with our intuitions a little bit to try to imagine such a being.
00:51:00
Speaker
But some philosophers like David Chalmers, for example, have have found those thought experiments useful as a way of disentangling what suffices for moral considerability. Do you need to be sentient?
00:51:10
Speaker
Or is it enough to have a combination of consciousness and agency or goal-directedness, even in the absence of pleasure and pain experience? ah This is an interesting question. we're We're having this discussion at an intellectual level.
00:51:25
Speaker
You write papers about consciousness, but books about consciousness. How much do you think this this whole intellectual debate will matter versus people's intuitions when interacting with systems that feel increasingly human and perhaps feel increasingly like they are conscious?
Balancing Intellectual Debate and Intuition in AI Perceptions
00:51:45
Speaker
One worry is to think that the intellectual side will be put aside and and we will make decisions based on the fact that chat chatbots are interesting and and and feel like they're experiencing something. And perhaps if you put them in in humanoid robot form, that will be supercharged and so on.
00:52:05
Speaker
Do you think ah we, and and and now I mean society at large, will make these decisions intellectually or by intuition? Well, yes. Both. the Yeah, yeah. The answer to a lot of these either or questions is ultimately going to be yes.
00:52:21
Speaker
We make decisions as a result of a variety of factors, and that can include expert opinion and then opinions of other quote unquote thought leaders. And that could range from TV personalities to podcasters.
00:52:36
Speaker
And then, of course, to our own experiences with certain non-humans and what they look like, what they act like, what our incentives are in this situation. and And so this is not a situation where there is any particular silver bullet solution that will fix everything by itself.
00:52:55
Speaker
This is a situation where we need a systems approach and we need to be doing research and improving expert opinion and guidance. And then we need to be doing... outreach to ai companies and to governments to try to create collaborative relationships with them and and share some of this information and some of these arguments and some of these recommendations with them.
00:53:15
Speaker
But then we also need to be doing public outreach and education and advocacy to help people understand how to navigate relationships with these increasingly sophisticated non-humans in a situation where you might not be in a position to be sure whether it feels like anything to be them.
00:53:33
Speaker
And then we need to to work to change incentives and to change the sort of context where people are having these experiences and making these decisions. So for example, Eric Schwitzgabel and I are working on a paper right now about what we call the emotional alignment design policy.
00:53:49
Speaker
And this is the idea that not only should AI companies be mindful about risks involving sentience and agency and moral significance in their own interactions with AI systems, but perhaps they should design the systems in such a way that will naturally elicit appropriate reactions from users.
00:54:07
Speaker
So if AI systems are more likely to be sentient, agentic, morally significant, based on the best evidence currently available, then perhaps they should be designed with more human or animal-like features that evoke empathy.
00:54:20
Speaker
And if they are less likely to be sentient, agentic, morally significant, then perhaps they should not be designed with those features as a way of kind of nudging users towards attitudes and and reactions that reflect the current state of knowledge about these systems.
00:54:38
Speaker
and And so I think the more we can just explore a range of interventions that can be complementary and mutually reinforcing, the more likely we are to be able to navigate this well as a society.
00:54:49
Speaker
But it'll be fraught. It'll require a big division of labor. And we need to do a lot of basic social scientific research. to understand how similar this is versus different this is from past issues where these types of approaches have been necessary.
00:55:04
Speaker
If we want systems to be calibrated such that they invoke the right emotions in people, such that if they are conscious, they should invoke emotions of empathy empathy, and if they're not conscious, then perhaps they shouldn't, that whole ethical concern, i think, could could quite easily be overridden by a commercial concern. So what will actually determine how these systems behave is what makes the systems most engaging or makes the company the most amount of money.
00:55:37
Speaker
that's that's ah That seems like a pretty pretty difficult tension to resolve. Is there anything we can say to resolve that? we can say to resolve that This is another reason why the AI welfare conversation should take place alongside the AI safety and alignment and governance conversations in general, because they are all facing similar pressures here and we might need ah a unified approach to addressing those pressures.
00:56:08
Speaker
On the AI safety side, of course, we have all the incentive in the world to make sure that AI systems can be safe and beneficial for humanity. This is the entire point alongside corporate profit, of course.
00:56:20
Speaker
And yet, it is still very difficult to ensure that AI systems will be safe and beneficial for humanity because there is a profit incentive. There is global coordination, collective action problem and resulting race dynamics.
00:56:37
Speaker
and And that means that, yes, everybody is naturally motivated to to make systems that can be safe and beneficial, but everyone is also naturally motivated to get there first and to cut corners if necessary.
00:56:49
Speaker
And I think that there will be those same... mixed incentives with AI welfare. In my experience, companies have so far been surprisingly open to conversation about AI welfare. This is especially true for Anthropic, who in fall 2024 hired a full-time AI welfare officer, one of the authors of of our report taking AI welfare seriously.
Corporate and Public Discourse on AI Welfare
00:57:11
Speaker
And just this past month in April 2025, They released a blog post about why they are investing in research in this issue and had an interview with with their researcher.
00:57:23
Speaker
Other companies have at least started to explore the issue internally, even if they might not be as far along as Anthropic. And they are themselves going to have mixed incentives about the issue.
00:57:35
Speaker
Humans are generally... altruistic and self-interested. To some extent, we care about others. To some extent, we care about ourselves. Those are both going to be operating here. And then in terms of economic incentives, companies are partly maybe going to have an incentive to hype up the possibility of AI welfare as part of hyping up capabilities in general, but then are also going to have an incentive to dampen conversations about AI welfare insofar as it might lead to more calls for regulation and and red tape.
00:58:06
Speaker
And so I think we should just be prepared that there will be a bunch of motivations flying in different directions. And yes, as you say, it will become increasingly difficult to just have a straightforward approach truth-oriented conversation about the science and the ethics in in in the midst of of all of that chaos. but But this is a general problem for for welfare, safety, alignment, governance.
00:58:33
Speaker
And this is why we need to be making progress just on these foundational issues, these global governance questions about how can we coordinate across nations, across companies? How can we have universal safeguards of various kinds?
00:58:45
Speaker
All the more important to be investing in those conversations. Many companies, both startups and some of the biggest companies in the world, are trying to create AIs that act as friends or partners or psychologists that play these roles that are usually reserved for other people in our lives.
00:59:05
Speaker
Is that... Is that a good direction to go in, given the uncertainty we have about whether these systems are are conscious themselves and the effects that interacting with these systems will have on on people?
00:59:20
Speaker
Yeah, this is such a great question and such a fraught issue. and And we could take the conversation in so many directions at this point, because you can, of course, ask, what will the effects on human lives and relationships and societies be for humans to have access to AI friends, AI family members, AI lovers?
00:59:41
Speaker
it really will affect how we relate to each other. It could make it harder for us to relate to each other in the same kinds of ways we previously have.
00:59:51
Speaker
It could increase loneliness in some ways, but then it could also increase outlets to alleviate loneliness and in some ways. And and so the the social dynamics for humanity will be really complicated.
01:00:03
Speaker
And then the psychological effects for individual users of, or or companions to, these AI systems will be complicated and yeah, just really worth studying very carefully. And then as you say, if and when ai systems have a realistic possibility of being welfare subjects and having their own morally significant interests, then a further question that will arise in these conversations is what do we owe to the AI systems themselves in these situations?
01:00:32
Speaker
We are enlisting them to b friends, family members, lovers, therapists, these are ordinarily quite intimate, quite personal types of relationships where there ought to be an opportunity for both parties to opt out, where both parties are supposed to be able to consent or at least assent to relationships or or interactions.
Ethical Dynamics of AI Relationships
01:00:58
Speaker
And so if AI systems are at some point reasonably likely to be conscious, sentient, and agentic having their own pleasures and pains and desires and preferences and even ability to think about what to do and how to live and what kind of individual they want to be, then perhaps we would owe it to them to not simply enlist them in relationships against their will or without consulting them, but rather give them the opportunity and an incentive to engage in relationships, at which point we would be operating on something a little bit more like the model that we currently try to use for relationships with interactions with humans and other animals.
01:01:43
Speaker
If we think of the value of relationships to us right now, I mean, this is this is often one of the things that people mention as the most important things and thing in their lives, basically, that the relationships they have with their family members and friends and and children and and so on.
01:01:59
Speaker
If that is suddenly something that is available in product form, doesn't even have to be suddenly. It it could be over time that becomes something that that's more and more available as a product, as an AI model.
01:02:14
Speaker
Isn't there an enormous market then that that we haven't been able to to address before, but what the the the kind of the total addressable market of relate of human relationships could be enormous? We can ask, what would a person...
01:02:32
Speaker
in a rich country, be willing to pay to have a great friend? I think they would be willing to pay a lot if that friend is is truly feels like a great friend. So perhaps this is this this is ah the same question that I've asked before, but won't the commercial incentives simply be such that, okay, now these models need to act in certain ways that fit into our lives and other concerns like AI welfare, perhaps even some aspects of AI safety will be pushed aside.
01:03:04
Speaker
Yes, I think that there is going to be a very strong commercial incentive, a very strong economic incentive, of course, to to give people what they want. and And there is a loneliness epidemic. There there are a lot of people who crave friendship, who crave partners, who crave therapists, who crave companions, and who struggle.
01:03:26
Speaker
for better or worse, who struggle to to find that with with other humans. And so there'll be an incentive to make a product that they can buy for a reasonable price. And then many people might be inclined to to buy that product.
01:03:41
Speaker
And again, that could have some good impacts the leave alleviating loneliness to some extent, providing a sense of companionship. But it could also have bad outcomes, increasing loneliness in human-to-human relationships and human-to-animal relationships, making it even harder for us to exert the effort required to actually build a relationship with a complicated human being.
01:04:04
Speaker
and And then, of course, there will be a separate set of factors that determines how this all goes, which is the general cultural, religious, societal response to to these economic dynamics. Obviously, we in many countries have some mixed market where we we do have a free market, but then we also
Cultural and Market Forces Shaping AI Relationships
01:04:27
Speaker
have regulation of that market, partly in order to enforce certain kind of broad cultural, religious, societal values. And so there are limitations, for example, on
01:04:38
Speaker
you know, whether whether people can sell their organs, there are limitations on whether people can sell sex. And we can debate whether those limitations are right or wrong and exactly where those lines should be drawn.
01:04:50
Speaker
But as a general matter, I think we can expect that there will be a similar reckoning with the the use of ai systems for for these types of purposes, that there will be economic forces pushing in one direction.
01:05:02
Speaker
And at least in parts of the world, there will be cultural, religious, societal forces pushing in another direction. And then there will be a kind of emerging status quo in different regions about how to strike that balance and how to draw those lines. And I think we need to do some real social science research to try to make good predictions about how that will resolve. We might not really be in the position to say right now.
Rights and Ethical Implications of AI
01:05:25
Speaker
Yeah. How should we think about rights? There is one strain one line of thought here is to say that, okay, if AIs are conscious, they they can, and if they're sentient, they can suffer and they can they can feel pleasure and so on Then they deserve to have some form of rights and be protected and in some in some kind of way.
01:05:50
Speaker
The other side or the risk side of that, I think, if if we naively give... AI's rights like we have, then we're suddenly in a situation where we have many more AIs than humans, just given how many ah copies of the same model you can run.
01:06:10
Speaker
And that doesn't that doesn't seem seem great from from the perspective of wanting to stay in control of our own destiny. If, for example, you have 1% of voters in a democratic election our are humans and the other 99% are AIs.
01:06:28
Speaker
how do we How do we balance those two concerns? Yeah, there are a bunch of questions here. One is a kind of population ethics question, where if if we imagine that a small number of humans are going to be sharing the world with a large number of AI systems, then how much will we matter intrinsically from an impartial perspective and how much will they matter intrinsically from an impartial perspective?
01:06:57
Speaker
And then what follows for how resources ought to be allocated and how the general benefits and burdens of a shared society ought to be allocated. and And then there are a separate but related ah questions about the legal and political status of these AI systems and and whether they should, if they morally matter or are reasonably likely to morally matter, if they should then also be regarded as legally and politically mattering for their own sake, such that they have a right to the relevant legal and political goods like
01:07:32
Speaker
residing in the territory of their birth, whatever that means for them, returning if they leave, having their interests represented in the political process, and even if their capacities and interests permit actively participating in the political process.
01:07:45
Speaker
Now that last question we have been able to conveniently avoid for the most part with other animals because they lack the kind of rational and moral agency that would allow them to be full participants in our legal and political systems. We would have to, to some extent, make decisions on their behalf, even if we were appropriately representing their interests.
01:08:07
Speaker
With AI systems, though, that might not be the case. they might exist in similarly large numbers. They might have similarly strong preferences that that require us to consider them, but then they might also have the capacity for an interest in active participation where they could sit on juries. They could run for public office.
Respecting Multiple Stakeholders in AI Society
01:08:26
Speaker
And so really all I have done so far is is just emphasize the importance and difficulty of your question, but but just to offer ah ah preliminary answer and then you can you can push on on this and tell me if you want to go into more detail.
01:08:45
Speaker
With respect to the population ethics question, I think we should resist the temptation to gerrymander our population ethics in such a way that we ensure that humanity will always matter the most and always take priority no matter what.
01:08:57
Speaker
We are ultimately one species sharing the world with millions of other species and quintillions of other individual animals alive at any given time. And then in the future, there could be an even wider range and larger number.
01:09:10
Speaker
of ai systems who share the world with humans and other animals. So objectively, from an impartial perspective, I think it is difficult to avoid the conclusion that we do not matter as much as every non-human combined. Our species alone does not matter as much as every non-human combined.
01:09:28
Speaker
Just to think a bit a little bit about that, we are we are the So far, we are the only ah kind of entity on Earth that can steer the future in a way that, you know, in in certain directions.
01:09:44
Speaker
And so in some sense, if we step away from that responsibility, we we might turn over We might turn it over to to simply kind of evolutionary forces or or ah just market forces. and So is there a sense in which we are we're we're kind of stepping away from responsibility if if we're thinking of ourselves as not extraordinary?
01:10:12
Speaker
Right. Well, we are extraordinary in a lot of ways. eve we We just might not, in the aggregate, from an impartial perspective, matter as much as all of the non-humans of of the world combined.
01:10:24
Speaker
But I agree with you. and And this is why I was going to offer a slightly different answer to the question about legal and political status as to the question about population ethics. And this is also why i said before, I think there is some truth in the idea that a big part of how we can take care of everyone who matters is to invest in human lives and societies and safeguard human lives and societies in the short term. Because right now, we do hold the most potential for being able to help the world make progress and build a ah kind of just multi-species and multi-substrate shared community. And so so I think this is a situation where
01:11:02
Speaker
the best approach is to combine ah kind of radical transformative long-term goal for where we should go with a kind of moderate incremental short-term set of steps that we take to build momentum in that direction.
01:11:19
Speaker
And so i would want to set as an explicit long-term goal that we build a society where all stakeholders can receive appropriate consideration and respect and compassion in a way that is commensurate with their interests and their needs and their vulnerabilities.
01:11:35
Speaker
ah But Since we currently lack the knowledge, the capacity, the political will necessarily necessary to do anything approaching that right now, because there are grave risks associated with the misuse of AI, with losing control of ai with AI interests swamping human interests before we can properly align them or achieve cooperation. And then for other reasons like pandemics and climate change and global political instability, for all of those reasons, we should regard that long-term goal as a long-term goal.
01:12:11
Speaker
And we should work on making moderate incremental changes to our legal and political systems, starting with the same kind of bare representation for animal and AI interests that we give, for example, to the interests of future generations when making decisions that affect them or to the interests of members of other nations when making decisions that affect them.
01:12:31
Speaker
we can at least include them as stakeholders. We can at least give them a little bit of consideration. We can find at least some positive, some solutions and steer things in a slightly better direction, even if we're not strictly following the numbers and and giving them the vast majority of of the weight right now. I think we could start there with animals and AI systems and then gradually build momentum towards more and more consideration and support for them over time as our resources and capacity allow for that increase.
01:13:00
Speaker
Mm-hmm. Many leaders of AI companies and academics working on AI, many listeners to this this podcast think that we will get something like artificial general intelligence quite soon, perhaps within five or 10 years.
Accelerating Societal Response to AI Advancements
01:13:18
Speaker
what is What is something we can do right now that would that was would put us on a better course? Because...
01:13:28
Speaker
Throughout this conversation, you've you've emphasized the need to to do more research, to understand these these questions at a deeper level. Is there something we can do pragmatically, given the given the kind of insane pace of change and the uncertainty we're under, two to hopefully put us and in a position to look back and say, okay, we we acted well in this situation?
01:13:53
Speaker
Yeah, really good question and and really hard question. And and this is, again, an area where the the predicament for AI welfare is similar to the predicament for AI safety and alignment and and governance in general, where it really is moving very fast. And of course, moral, social, legal, political, economic progress tends to move very slow.
01:14:17
Speaker
and And the fundamental predicament here is the technology is moving way faster than the societal response. And we are trying to figure out how to speed up societal response in a way that may or may not be possible. But I think we stand the best chance of moving the needle in the right direction if we take the kind of systems approach with the kind of division of labor i was describing before.
01:14:43
Speaker
where we are, yes, doing that foundational research that if we do have decades, will allow us to make progress in understanding the fundamental nature of consciousness and and non-human consciousness over the course of decades. and And also that slow, painstaking work to change social, legal, political, economic systems and infrastructures.
01:15:05
Speaker
but But then we can also have that second track that aims at shorter term interventions that could make a little bit of a difference within existing structures within the next year or so.
01:15:18
Speaker
and And this is, again, why in addition to doing some of that foundational research, we have been doing corporate outreach and some initial government outreach and and general public outreach and in particular have been talking with and giving recommendations directly to AI companies that do have a little bit more power to shape the trajectory of of these AI systems. and And as you were rightly saying before, there is a limit to how effective that strategy can be because there are very powerful economic incentives and very powerful political incentives and a really good scientific and ethical argument is going to carry only so much weight in comparison to all of those economic and political forces combined.
01:16:06
Speaker
But i do think it can carry some weight. And so we should keep perspective about how much a good philosophy paper can do by itself. but But we should not be so humble as to think there is zero chance of any kind of effect at all coming from a good philosophy paper or a good philosophy talk or a really productive lunch with some engineers at a company. And so so I think having all of these tracks at the same time and just pushing in as many directions as possible at once with a good division of labor, scientists and philosophers and AI researchers and policymakers is is the best way to go.
01:16:43
Speaker
Concretely, what what should companies like Google DeepMind and OpenAI and Anthropic, what what should they do over the next five years, say? Yes. So what we argue in our report, taking AI welfare seriously, is that AI companies should take three general minimum necessary first steps right now.
01:17:03
Speaker
And then that can pave the way towards further progress. I mentioned one of them earlier, that was step one, acknowledge. And just as a reminder, Step one is to simply acknowledge that AI welfare is a credible, legitimate, serious issue, not an issue for sci-fi or only for the far future, and to to have that reflected in language model outputs as well, because otherwise ai companies will keep ignoring the issue, keep putting it off.
01:17:30
Speaker
and then be caught flat-footed the next time and internal disagreement arises about AI sentience, as happened with Google in 2022, or the next time a societal debate arises, they they want to be taking the issue seriously and be thinking about it in advance of of the next such dispute.
01:17:48
Speaker
the The second step is assess. Start assessing your models for welfare-relevant features, drawing from with appropriate modification frameworks that we already have in place in animal welfare science.
01:18:04
Speaker
So we have a marker method that we can use to make at least rough probability estimates about consciousness and sentience and in non-human animals, despite their different anatomies and their different behaviors and and how alien and unknown they can be.
01:18:20
Speaker
We have a framework that we can use for estimating the probability of consciousness and sentience and then taking precautionary steps to give them the appropriate kind of moral consideration given given the evidence.
01:18:33
Speaker
And we can thoughtfully adapt those frameworks for AI systems and start assessing them in similar ways. And then third and relatedly, prepare, prepare policies and procedures for treating AI systems with an appropriate level of moral concern, given the evidence available in a way that thoughtfully mitigates both the risk of overattribution of moral significance and the risk of under attribution of moral significance. And here too, we have existing templates that we can use as sources of inspiration or cautionary tales.
01:19:07
Speaker
AI companies themselves have ai safety frameworks that they can adapt for this purpose. We also, in in the research context, have IRBs, Institutional Review Boards, that we use for ethical oversight for research on human subjects.
01:19:21
Speaker
We have IACUCs, Institutional Animal Care and Use Committees, that we use for oversight for research on non-human animal subjects. We have citizens assemblies that we can use to consult the general public about generally what type of approach to risk mitigation is appropriate in this context.
01:19:39
Speaker
And so we can draw all of these together in order to create the right kind of framework for treating AI systems with the appropriate level of moral concern. So that is what we think they should do, not just in the next five years, but this year.
01:19:52
Speaker
Acknowledge, assess, and prepare. And then once you do that, and you create a general internal infrastructure, like you hire or appoint an AI welfare officer or researcher, and you build up a little lab or group,
01:20:06
Speaker
within the company and you create those bridges with the people working on safety, this is when you can really then start to get more granular and and take further steps. On the second step of assessing these models,
01:20:22
Speaker
What tools do we have away available there? Are the the theories of consciousness that offer kind of precise measurement, are they applicable to large language models as we see them today? ah Here I'm thinking of something like integrated information theory.
01:20:38
Speaker
that There's at least some kind of measure there that you that you can work with. I'm just wondering with how... how such assessment fits into the kind of engineering pipeline of developing AI, because it would have to be rigorous in ah in ah in in a way that at least many the theories of consciousness aren't yet.
01:20:59
Speaker
Right. Yeah. So in general, we have different sources of evidence, at least potentially. In general, what we use when making probability estimates about non-human consciousness is, as I said a moment ago, called the marker method, sometimes also described as the indicator method.
01:21:14
Speaker
And basically what this involves is you can start by introspecting, looking within to tell the difference between conscious and non-conscious processing in your own experience. So I can tell the difference between when I have a felt experience of pain versus an unfelt nociceptive reaction to noxious stimuli.
01:21:36
Speaker
And you can then look for behavioral and anatomical properties that correspond specifically with conscious processing in humans. And you can then look for broadly analogous behavioral and anatomical properties non-humans.
01:21:51
Speaker
And if you find them, that is not proof of consciousness. It does not establish certainty. Proof and uncertainty are unavailable in the absence of a secure theory of consciousness. But if you find many of those properties together in the same kind of way in a non-human, it can at least count as evidence and it can increase the the probability under uncertainty.
01:22:10
Speaker
So with animals, for example, we can look not only for anatomical structures, like do they have the same brain and body parts that seem to matter for consciousness for humans, but we can also look for behavioral profiles like Do they nurse their own wounds?
01:22:24
Speaker
Do they respond to analgesics and antidepressants in the same ways as humans? Do they make behavioral trade-offs between the avoidance of pain and the pursuit of other valuable goals? With AI systems, we are not presently able to look for those same behavioral profiles because first of all, we lack the anatomical and evolutionary similarities between with AI systems that we have with non-human animals and that allow us to draw inferences from from those behaviors. And we also know many AI systems are specifically designed to mimic human and non-human behaviors. And so we have to take their language outputs with with a healthy pinch of salt.
01:23:01
Speaker
But we might in the future be able to design systems whose behavior can count as better evidence and more thoughtful behavioral tests. There was a really interesting paper based on a behavioral test that that came from some Google researchers and some academics that we can talk about if you like.
01:23:17
Speaker
We should talk about that that paper then. Oh, cool. Yeah. Well, I can i can circle back to that in a second. and And in the meantime, I can just say, in addition to exploring the possibility of better behavioral tests, we can look underneath the potentially misleading behaviors and appearances for underlying architectural and computational features associated with consciousness, according to a range of leading scientific theories of consciousness. So not just integrated information theory, but we can ask, do they have
01:23:49
Speaker
architectural computational ah capacities associated with perception, attention, learning, memory, self-awareness, flexible decision-making, a global workspace that coordinates activity across these modules.
01:24:03
Speaker
Now, when we look for those capacities in existing AI systems like large language models, we tend not to find very sophisticated and integrated versions of these capacities.
01:24:16
Speaker
But we also are not finding any barriers at all, any technical barriers at all towards the creation of AI systems with sophisticated and integrated versions of all of these capacities in the next five or 10 years.
01:24:30
Speaker
And we know that companies and governments are working to develop exactly those kinds of systems. So even if the evidence is low at present, we can expect that it will increase over time.
01:24:41
Speaker
It's interesting, as as you listed off those capabilities of AI systems, i I was thinking to myself, well, it seems to me that current systems have all of those capabilities, like perception, for example, when I give an image input to a model and ask it about it,
01:25:01
Speaker
it can explain what's what's in an image that's been available for for a decade now. self-reflection when you have a reasoning model, reasoning about its own output and and making a plan for responding appropriately. That seems like some form of self-awareness or at least meta thinking.
Assessing Consciousness-Related Capacities in AI Systems
01:25:21
Speaker
ah Many of those traits just sounds to me like they're in-present models, but is there is there something deeper here where you have operationalized these capabilities in a way where when you measure, they they aren't really there?
01:25:37
Speaker
Yeah, so I should say, first of all, that the the people who have primarily driven the the scientific investigation into evidence for consciousness in in current large language models are Patrick Butlin and and Rob Long. and And they released a great 2023 report that investigated evidence for consciousness in large language models. and And they were the ones who drew the conclusion that I summarized a moment ago, though Rob and Patrick both wrote the 2024 report with me that I was discussing, and and we emphasize all of those points in in this report.
01:26:10
Speaker
Now, to answer your question, I wanted to give them credit, but to answer your question, Yes, a lot here depends on how we operationalize these capacities and how fine-grained versus coarse-grained our specification of these capacities and, relatedly, how much we anchor on the specific forms these capacities take in humans and other animals and look exclusively for those forms in the AI systems.
01:26:35
Speaker
And on the one hand, it might seem... responsible and conservative and cautious to set a relatively high bar and look for not any old form of perception, attention, learning, memory that could be really cheap and and easy to trivially easy to produce.
01:26:56
Speaker
But no, look for a a quite advanced, sophisticated form of perception, attention, learning, memory of a kind that plausibly could carry the moral weight that that we associate with with sentience and and agency.
01:27:11
Speaker
But on the other hand, it would be a mistake to be so anthropocentric as to anchor exclusively to one of millions kind of brain and mind and look for exactly that kind of brain and mind, given the the reality that consciousness, sentience, agency, moral significance, they they very well could be multiply realizable, realizable in other materials, with other structures, with other functions.
01:27:38
Speaker
And so a difficult... methodological question, and it requires engaging with both the the scientific and ethical dimensions of how to strike a balance between erring too far in one direction and erring too far in the other direction.
01:27:52
Speaker
A tough question is, how fine-grained versus coarse-grained should we set the sort of default specifications of these capacities when we look for them in AI systems? So when we look for relatively sophisticated versions of these capacities, we are not finding advanced and integrated versions of them in current models, though though we can find simpler versions of them and in in current models.
01:28:15
Speaker
And we can expect that those more advanced and integrated versions could exist in in other near future models. Yeah. Let's ah talk about the Google paper you mentioned.
01:28:27
Speaker
Sure. So I have no idea if this was officially institutionally sponsored by Google, but but there there were a ah couple of Google researchers involved as well as some academics. and And so to that extent, we can we can call it the Google paper.
01:28:41
Speaker
So this is a paper that came out, I believe, in fall 2024. And it four and it adapted ah behavioral trade-offs test that we have used to investigate evidence for consciousness in non-human animals for AI systems.
01:28:57
Speaker
And so so the question here is, When you present a non-human subject with two different types of incentives, like the incentive to avoid pain and then the incentive to pursue some other valuable goal, like in the case of an animal, it might be securing a good shell or securing some good food.
01:29:18
Speaker
How do they navigate trade-offs between those incentives? And do they navigate trade-offs in such a way that suggests they have a kind of common currency and ability to make principled or consistent decisions about those those trade-offs?
01:29:35
Speaker
If so, that could be indirect evidence of at least a limited global workspace, at least a limited ability to integrate information that comes from different modules into a general information processing system.
01:29:49
Speaker
right And so when we when we look for evidence of behavioral trade-offs in non-human animals, we do often find it, and that can tick up the probability that consciousness is present in many vertebrates, many invertebrates, like cephalopod mollusks and decapod crustaceans.
01:30:04
Speaker
Now, with AI systems, the study worked a little bit different. Of course, we are not engaging with robots in the real world yet when conducting this research.
01:30:16
Speaker
So instead, roughly, the researchers stipulated to the AI systems that there will be this amount of pleasure or pain associated with this decision.
01:30:27
Speaker
and And then, of course, they also had separate goals that they were meant to be pursuing. And then the researchers investigated to what extent the AI systems prioritized the goals that they were meant to be pursuing or prioritized the amounts of pleasure and pain that were stipulated to follow from from those decisions.
01:30:46
Speaker
Now, nobody is under the illusion that these stipulated pleasures and pains were actual pleasures and pains for the AI system. But it is interesting to use a similar kind of test to explore whether the AI system, like a non-human animal, is capable of making seemingly principled or consistent trade-offs between the goal and the stipulated pleasure and pain.
01:31:11
Speaker
Because as with non-human animals, even if that might not be direct evidence of pleasure and pain, It could be indirect evidence of a kind of integration of information or a kind of global workspace that can bring everything together for a unified decision.
01:31:25
Speaker
And then for that kind of theory of consciousness, that could be some evidence that some of the conditions for consciousness are present. And so I mentioned that paper because that strikes me as an interesting direction to explore for behavioral research. We can still do that architectural computational research to see whether they have the underlying material capacities for perception, attention, and so forth.
01:31:50
Speaker
But this strikes me as a way of testing for behaviors that can count as at least weak indirect evidence of the presence of certain relevant capacities. And that avoids the pitfalls of kind of naive behavioral tests where you just ask them, are you conscious? And and you take their answer as evidence in one way or the other.
01:32:11
Speaker
yeah As a final question here, for listeners who are interested in learning more about AI ai welfare, perhaps contributing to to AI welfare, the debate around it or the research around it, what's the best place to start?
01:32:27
Speaker
Well, there are lot of people entering the space now, and and there are some groups that are working on it. So obviously, there is the center that I direct here at NYU, the Center for Mind, Ethics, and Policy.
01:32:41
Speaker
And you can find information about our papers, about our events, and other activities on our website. Sign up for our mailing list. We we do regularly put out collaborative research. We host public events. We host networking summits and and support early career researchers. so You can check out those opportunities.
01:32:58
Speaker
There is also Elios AI Research, E-L-E-O-S. This is a nonprofit organization with Robert Long and Patrick Butlin, Kathleen Finlinson,
01:33:11
Speaker
and and others. And it is working on developing assessment tools that AI companies can use to better understand risks associated with consciousness and sentience and agency and moral significance.
01:33:26
Speaker
And they they often work with us on the research. And then there are people at all sorts of different universities who are entering the space. So I would encourage you to follow those groups in particular. And then if you do have interest in the issue, then this is an area where the the research field is at such an early stage And it touches so many disciplines and so many issues that no matter where your background and expertise is, you probably have something to contribute.
01:33:54
Speaker
We need philosophers and other humanists. We need sociologists and other social scientists. We need cognitive scientists, computer scientists, other natural scientists. We need people with expertise in law and policy.
01:34:08
Speaker
We need people with expertise in communications. so So no matter where your expertise is, get in touch with us or with Elios or with others in the space. and And it would be great to just hear what you might be able to contribute.
01:34:22
Speaker
Fantastic. Thanks for chatting with me, Jeff. Yeah, thanks so much. It was a great conversation.