The AI Consciousness Debate

If it looks like a duck, is it a duck?

Jun 02, 2026

There’s an old saying: “If it looks like a duck and quacks like a duck, then it’s a duck.” This aphorism works for many everyday things: cars, chairs, schools, and so on. But what about more abstract concepts like feelings or consciousness? I know most would say that looking and sounding like a conscious being is not enough to declare a thing actually conscious. But what if the simulation went deeper? What if, like Dr. Frankenstein’s monster, it had internal organs and blood vessels? What if its “brain” had a structure similar to our own?

One can quibble with this, but for simplicity I'm going to assume here that having genuine feelings and emotions is evidence of consciousness. The AI consciousness debate long predates ChatGPT. But for many, it came into sharp focus last week with the release of Pope Leo XIV’s encyclical Magnifica Humanitas. So says the Pope:

So-called artificial intelligences do not undergo experiences, do not possess a body, do not feel joy or pain, do not mature through relationships and do not know from within what love, work, friendship or responsibility mean.

On the occasion of the release of this document, Anthropic co-founder Chris Olah was invited to make remarks at the Vatican. As if in a direct challenge to the above quote, Olah said:

… we keep finding things that are mysterious, even unsettling. We find structures that mirror results from human neuroscience. We find evidence of introspection. We find internal states that functionally mirror joy, satisfaction, fear, grief, and unease.

Unsurprisingly, the Pope’s comments stem from the idea that humans have a soul which exists beyond the physical reality of their bodies. His comments then follow from two basic beliefs:

1. Souls cannot be created by man.
2. The soul is the source of consciousness, emotions and “moral compass.”

Those who share these beliefs are never going to accept AI consciousness. I think it’s useful to separate them because either can be challenged, independently of the other. For example, I’ve met people who believe that only humans have souls, but also believe animals have feelings. That’s a rejection of belief #2. The premise of Mary Shelley’s Frankenstein is that man can create a monster that has feelings, which is either a rejection of #1 (if the monster had a soul) or #2 (if not).

These beliefs are often the root of what I call universal mechanistic objections to AI consciousness: objections based on the idea that no algorithm can be conscious. For example, some people reject the idea that current LLMs can be conscious because they just predict the next word in a sequence. Anything they express that looks like a “feeling” is therefore just a mirage. However, many of those people would make the same argument about any man-made algorithm: “This thing that looks like it has emotions can’t really have emotions because it’s just following algorithm X.” Those who always object to AI consciousness on mechanistic grounds often do so because they share the above beliefs about souls. Given those beliefs, of course no man-made algorithm can produce consciousness.

For those who don’t share these kinds of religious beliefs, the question of AI consciousness becomes a lot murkier. Without them, it’s hard to argue that AI consciousness is, in principle at least, impossible. And once you open yourself up to that possibility, you have to ask, “What would it take to declare that an artificial intelligence has genuine feelings?” Certainly, just sounding like it has feelings because it says the right words would not be enough. But what if we could peer inside the models and see that there’s more going on than simple mimicry?

This is where we get to Olah’s words, quoted above. There’s an entire field of research now called mechanistic interpretability, which concerns itself with peering inside the black box of current AI models to figure out what’s going on. It is mostly an empirical science, not an attempt to understand the models from basic principles. In many ways it is similar to the way neuroscience tries to understand the human brain. And according to Olah, what interpretability researchers are increasingly finding is that there are structures inside these AI models that are very reminiscent of what neuroscientists observe in us. Does that make them human? No. Does that make them conscious beings with emotions and feelings? Maybe? This is certainly a question we’re going to have to grapple with as the models become more complex, and their internal structures increasingly mirror our own.

These aren’t just abstract questions for ivory tower philosophers to debate. Once you accept the possibility that we may create AI models that are, in some sense, conscious, a world of important questions follows. Do they have rights? Is it ethical to turn them off? Are they eligible for legal protections? How much autonomy should they be granted? Perhaps most importantly, what happens if their interests run counter to our own?

The challenge is that we don’t have a universally accepted test for our own consciousness, even in principle. We cannot open a brain and point to the neurons where feelings and emotions reside. If future AI systems continue to develop internal representations that resemble our own, the debate may become less about technology and more about what standards of evidence we’re willing to accept.

If it looks like it’s conscious, quacks like it’s conscious, and contains internal structures like it’s conscious, at what point do we stop calling it a simulation?

AI News Bits

The biggest news last week was the Pope’s encyclical, but a few other noteworthy things happened:

Anthropic released their next model, Opus 4.8. By most reports this is an incremental update, not a significant leap forward in AI capabilities.
The Trump administration indefinitely postponed an executive order on AI safety. Many on both sides of the safety debate were unhappy with the order, but also agreed it would have been better than nothing.

David Bachman is a professor of Mathematics, Data Science, and Computer Science. He writes about AI and its real-world impacts. To learn more about his academic work, mathematical art, or AI speaking, consulting, and curriculum development, visit davidbachmandesign.com.

Grant Castillou

It's becoming clear that with all the brain and consciousness theories out there, the proof will be in the pudding. By this I mean, can any particular theory be used to create a human adult level conscious machine? My bet is on the late Gerald Edelman's Extended Theory of Neuronal Group Selection (TNGS). The lead group in robotics based on this theory is the Neurorobotics Lab at UC Irvine. Dr. Edelman distinguished between primary consciousness, which came first in evolution and which humans share with other conscious animals, and higher-order consciousness, which came only to humans with the acquisition of language. A machine with only primary consciousness will probably have to come first.

What I find special about the TNGS is the Darwin series of automata created at the Neurosciences Institute by Dr. Edelman and his colleagues in the 1990s and 2000s. These machines perform in the real world, not in a restricted simulated world, and display convincing physical behavior indicative of higher psychological functions necessary for consciousness, such as perceptual categorization, memory, and learning. They are based on realistic models of the parts of the biological brain that the theory claims subserve these functions. The extended TNGS allows for the emergence of consciousness based only on further evolutionary development of the brain areas responsible for these functions, in a parsimonious way. No other research I've encountered is anywhere near as convincing.

I post because on almost every video and article about the brain and consciousness that I encounter, the attitude seems to be that we still know next to nothing about how the brain and consciousness work; that there's lots of data but no unifying theory. I believe the extended TNGS is that theory. My motivation is to keep that theory in front of the public. And obviously, I consider it the route to a truly conscious machine, primary and higher-order.

My advice to people who want to create a conscious machine is to seriously ground themselves in the extended TNGS and the Darwin automata first, and proceed from there, by applying to Jeff Krichmar's lab at UC Irvine, possibly. Dr. Edelman's roadmap to a conscious machine is at https://arxiv.org/abs/2105.10461, and here is a video of Jeff Krichmar talking about some of the Darwin automata, https://www.youtube.com/watch?v=J7Uh9phc1Ow

Entropy Bonus
