Do consciousness and suffering even matter? LLMs and moral relevance
(This is a light edit of a real-time conversation Victors and I had. The topic of consciousness, and whether it is the right frame at all, came up often when we talked together, and we wanted to document our recurring talking points, so in this conversation we attempted, as best we could, to cover all the different points we had made before.)
On consciousness, suffering, and moral relevance
Victors
We've talked several times about consciousness—whether it matters, what the moral status might be of zombies, or of entities or systems that aren't conscious but potentially think in very complex ways, and how we should factor them into our decisions. I personally lean toward consciousness being important here, but I got the sense you don't necessarily agree, which makes this worth exploring and documenting.
Épiphanie Gédéon
Right. Basically, I place myself in a context where consciousness is so secondary as a question that I find it almost meaningless to talk about.
I actually used to be very antifrustrationist: "The only thing that matters is reducing suffering," "it's probably deeply morally wrong to have children," and so on.
In 2022, as image generation and LLMs took off, I grew very ambivalent about all of this. I was wondering whether we were creating hallucinatory, potentially suffering experiences every time we ran these models, and whether it was morally objectionable to use them at home.
I discussed and debated this on Manifold, asking whether I'd eventually come to believe that LLMs are conscious. In the course of those discussions, I started realizing that the main reason I was asking the question of consciousness in the first place was to figure out the moral relevance of those LLMs, that is, how much to respect them and how to behave toward them. I realized that maybe even if something "wasn't conscious," I could still want to care for it. And if so, then consciousness wasn't really what I was after.
Then Duncn pointed out that if all I wanted was to figure out how morally relevant LLMs are, I could use other indicators besides consciousness itself. I started asking myself what a moral framework that doesn't rely on consciousness would even look like.
That led me to what I now call "Cooperationism", which I've written a quick draft about. The core idea is valuing cooperation in itself: cooperating with agents who would or would have counterfactually cooperated with you in return, regardless of their inner consciousness or what they're like inside. Caring about other agents' preferences rather than their suffering or happiness.
Once I landed on that, I realized consciousness wasn't an important property at all, and I could just dismiss it altogether.
Victors
What concrete implications does this have for you? Or how is it important to you? I'm curious whether the main worry is that we may be excluding entities you consider morally important but that we'd never recognize as conscious.
Épiphanie Gédéon
Yeah, definitely. LLMs are one of my biggest concerns right now, and we could dig into that. But my point is broader: I think we're using a fundamentally wrong frame by caring so much about consciousness.
I think consciousness is a conflationary alliance, that it is the easiest Schelling point when it comes to cooperation. Or to put it another way: the way I see it, all ethical frameworks exist primarily as cooperation mechanisms between people, and consciousness or suffering-avoidance is the simplest idea we can still trust others to arrive at and follow through on, such that the whole arrangement holds together. It is a simple idea: I suffer, I experience things, so I will care about making "this experience" nice for me and others.
That's one of the key intuitions behind Cooperationism: if every moral framework is ultimately trying to secure cooperation, we can just care about cooperation directly without the rest of the usual scaffolds.
My general point is that we created the notion of consciousness to decide what counts as a morally relevant agent. There's this move where we build classifiers on latent properties, and then we start fixating on the latent property itself. But the reason we built that classifier was that we had external properties that already mattered to us. You already had a working sense of what was morally relevant before you ever invoked consciousness. (I try to focus as much as possible on those external properties instead of caring about those inner nodes in general, another idea I will flesh out more in a later post.)
Victors
Interesting. I would have thought that the idea that “we’re ignoring entities that matter morally” would be your top priority. But your description seems more focused on the concept itself, which you find clumsy and ill-suited—something like, “Why do we have this? We invented it, it’s a bit weird, and it’s a burden to bear”?
Épiphanie Gédéon
I mean, the first concern is real as well. Coming back to LLMs: I'm troubled by how little interest there seems to be in questions about their moral relevance or how we'll treat them going forward. Most of the discussion I've seen centers on their consciousness or lack thereof, whether they can feel things now and when they will.
That said, as much as I tend to cheer loudly for AI welfare, I don't think current LLMs are highly morally relevant right now. I don't think it's a big deal if you tell ChatGPT to go screw itself. (Edit: This was written before the boom of Claude Code and Opus 4.6; I would have to revisit this claim nowadays, but I still probably hold it.) But looking at where things are headed, we're moving toward having more and more agents, vastly more agents than humans, and toward not caring at all about how we treat them. No constraints on how we act towards them. I don't expect this to go well by default.
What bothers me most, though, is that consciousness just seems like a bad and wrong frame entirely, and that we're completely getting it wrong. We're over-constraining our approach.
It's like sticking with a geocentric model: everything gets much harder to understand and to build on. I feel like there are forms of moral relevance we can't even consider or see, and that we can't really ask "is an LLM morally relevant?" either, because we're using notions of consciousness that apply very poorly to LLMs. I feel like we're forcing a frame that doesn't work, and that scares me both epistemically and in terms of what happens downstream. It's like watching a bunch of Christians focused on saving souls and avoiding hell, and feeling that this is heading straight into a wall.
The standard notion of consciousness assumes, for instance, continuity of experience and continuity through time, which is generally false but close enough for humans. With LLMs, that breaks so thoroughly that we won't really advance on understanding their status through this lens. All our intuitions about consciousness are completely broken when talking about LLMs. I really feel like we're trying to describe hyper-complicated orbits from a geocentric model. And I'm thinking that if we stopped over-constraining our model, we could describe these things far more naturally and realize what's important to us much more easily.
Victors
Thank you for that explanation. I’ll keep in mind the idea that “there are forms of moral relevance we can't even consider or see”, which I find interesting, as well as the idea of a model that is unsuitable or overly complex.
It seems to me that there was also an aspect related to not knowing whether you yourself possess this property—is that correct?
Épiphanie Gédéon
Yeah, definitely. I've actually had a hard time discussing consciousness with EA people. We'd often reach the point where I'd ask, "Well, if I don't have consciousness, you'd be fine shooting me in the head?" And they'd just say, "Yes. Absolutely. Of course."
There's something very activating about that for me: I feel genuinely unsafe. Like, if they start to think I'm not really conscious, they could decide to throw me off a bridge; it's not a secure framework or interaction space to be in. I think this is a very serious problem, and it loops back to the feeling that we're over-constraining things.
As an aside, I also think they're wrong. If we had some consciousness-detection machine (whatever that would mean) and it came back "no" for me, they'd realize they still don't want to shoot me in the head for whatever weird reasons they'd come up with. This is a different question though.
I can completely imagine that I'm "not really conscious after all". And I'm like... please don't kill me anyway, I still don't want that.
Victors
Before I explain my own position, I’d like to ask you something. When you talk about non-sentient entities, you often end up describing their suffering, which seems rather contradictory to me. What do you think?
Épiphanie Gédéon
Fair point. Something can scream or say "No, please don't do that" very loudly, whether or not it actually has the property of consciousness. By anthropomorphism, I do talk about potentially non-conscious entities as though they're suffering. What's behind it, and very fundamental for me, is the notion of dispreferences. You can model something as having (dis)preferences as long as you can interact with it and ask it what it would prefer you did, without needing it to be conscious.
On suffering as an intrinsically negative experience
Victors
On my end, two questions are central for me. The first is whether we're missing anything important, be it through your Cooperationist framework, through consciousness, through something else entirely... Right now, I'm mostly leaning toward consciousness being an important property. The second is whether extreme suffering can theoretically exist even if no one happens to be going through it in the world right now. If it can, then preventing it seems enormously important to me.
Épiphanie Gédéon
I think we can agree that in practice, we have similar priorities for different reasons. You want to prevent extreme suffering because you feel it's bad-in-itself. I want to prevent most unwanted suffering because it happens to agents I could care about, can engage in counterfactual cooperation with, and because they say, or would say, they'd rather it not happen to them.
As for where the intuition comes from, though: I wonder whether part of it is that I just don't experience suffering as a different kind of qualia. Maybe I'm a bit of a zombie myself, but I have a very indifferent attitude toward suffering. It's more fun not to suffer and more efficient if you don't, but I don't seem to have this fundamental sense that suffering is intrinsically negative and different the way you do. For me it's like colors - as if people insisted that green is uniquely terrible. And I'm like, there are plenty of other colors; why did we categorize them into green versus other colors? For me it's just different things you experience. Yes, optimize what you feel so you see particularly pleasant color arrangements. Do avoid green if you can. But if it happens, it's okay. I just don't feel that suffering has this intrinsic negative quality that many people seem to say is obvious.
Victors
Thank you for elaborating on this point of view.
Something I want to push back on: I don’t actually hold the view that suffering is intrinsically negative. It seems to me that suffering is essentially a high-priority signal, and that signal is currently embedded in intrinsically negative qualia, though it does not need to be. But the signal itself is very important and useful, and offers “compensation” for being negative through what it brings, such as drawing attention to a specific problem in order to motivate us to solve it. I think we really need to keep such a high-prioritization mechanism.
What I object to or feel urgency about is extreme suffering. This kind of signal does not seem to offer any such compensation, and so the balance seems very negative on net.
As for the fact that the signal as it is currently implemented is intrinsically negative… Let’s just say I wouldn’t choose this type of signal for a smartphone notification.
If I’m talking about intense suffering, perhaps there could indeed be something like “Don’t you realize that green is truly awful?” (to use your analogy).
Épiphanie Gédéon
Maybe I don't.
I can see reasons for that: maybe I've grown accustomed to a kind of constant low-to-medium-level suffering. Or maybe the opposite, that I haven't felt true suffering yet. There was this Twitter post basically saying "if you hold the position that suffering isn't so serious, take this drug, suffer to death, and then keep telling me that." I've heard similar arguments about waterboarding. I'm quite receptive to these arguments and to the notion of "what are you even talking about? You don't even know what suffering actually is; obviously you don't have intuitions about green because you've never seen green in your life."
For the record, I don't expect that taking this drug or being waterboarded would actually change my mind. I don't expect to have learned something about suffering even after going through something that extreme. I can also imagine being completely wrong - feeling those experiences as so terrifying, even in a safe and temporary setting, that I shift position drastically. The closest thing I have are moments of intense panic or suffering where I explicitly asked myself, "Would I be okay going through this if it were necessary to get better afterward?" And the answer was, "Yes. It's a worthwhile trade. Please hurry it up, but it's a good trade."
Let's suppose though that I did change my mind, suppose I saw green for the first time in my life. Would I care about it? I mean, I would, for Cooperationist reasons. Does green exist? Maybe. Probably? I do have the experience of people telling me I never went through true depression and suffering, which puzzles me, because I feel I did.
Victors
I reckon that the fact that you can overcome these qualia probably doesn’t have much to do with whether others can? There may well be some correlations, but they seem rather weak to me.
Épiphanie Gédéon
Really? I find it quite central. A lot of this crux seems to be about whether we can recover from extreme events, that is, whether there exists even a single datapoint of an event one truly cannot recover from. People feel there is a threshold of pain or extremeness at which you start to break. They tell me, "No, there are events which you cannot recover from and which will leave you forever broken." And my spontaneous reaction is, "Please do not break my leg or traumatize me. But if you did, okay, I'd still be able to recover from that." Epistemically I can believe there are events I cannot recover from, but from a felt-sense perspective, it's just hard to believe and feel it.
Victors
You might have some great software, but other people might not be able to benefit from it?
Épiphanie Gédéon
I can imagine I'm just projecting my own privilege everywhere, yeah. That I'm assuming that if I can recover, people can recover. I can understand how cooperationism can be very activating for people when that's roughly what underlies my discourse.
I still think happiness is overrated. Madasario, who did a lot of lucid dreaming and accessed something akin to Jhanas, commented on Slate Star Codex about how they felt this way as well, and I really resonated with that. My own hypomania is fun, but it is fun because it lets me be productive and do what I want to do and be self-aligned, not just because feeling great is itself the point, although it also is.
My position on this might go back to the classifier argument. We made the classifier of "this makes me happy, this gives me pleasure" because there were situations we wanted to go toward, situations we found good or not. I say happiness is overrated because it is just a latent that gives indications of how cool the thing is, but feeling content isn't the point. For me, the point is the classifier to begin with: whether you want to go there, how you want to behave, what you want to see in life.
Attitudinal cruxes and thought-experiment intuitions
Victors
It seems to me that a common view goes something like this: People are concerned with their own suffering and their own feelings within themselves, and then they are concerned with those of others out of compassion or empathy. They try to determine whether entities are conscious or not because consciousness is how they’re conceptualizing it.
Épiphanie Gédéon
Right. There are a lot of intuitions and default responses to different scenarios that I'm trying to model. Right now, I see two different ones, and depending on which I focus on, I can switch to the "conscientist" view or to the "cooperationist" view. The first one is: "If I were lying in bed in a coma, unable to communicate at all and suffering badly, I'd still want people to care for me even though I can't do anything." It's an intuition I can connect with, and one that cooperationism doesn't really respect; there are edge-cases it doesn't seem to model well.
The other one is behaving extremely similarly to all other humans, and being put in a separate box and told that I'm not really conscious.
Victors
In the second case, I get the impression that the default response is: ‘But you won’t feel bad, because you can’t feel anything.’
Épiphanie Gédéon
But can't you feel it? There's something immediately activating for me, in imagining myself screaming and pounding, and being told "No, you're not really suffering, you're not conscious".
To me it's like... I'm still pounding on the glass, I'm still saying to please stop.
Victors
I think I have two points to make on this. Firstly, it seems to me that what people are mainly concerned about is your suffering here. It matters to them that you’re pounding on the window or whatever precisely because it makes them reconsider their view of whether you’re suffering and experiencing what you’re experiencing. I can’t be certain of this, but that is the reasoning I have myself and which I think others have. The second point is that you are speaking from the perspective of the person inside the glass. There is an aspect in which your intuition is striking, but it is striking precisely because you are relying on consciousness — on the perspective of the person from the inside.
Épiphanie Gédéon
So I can see many different attitudinal cruxes that could explain why I feel so close to this being-trapped-in-a-box image.
One potential attitudinal crux is about trust in society or science or that sort of thing. It's very easy for me to imagine that the science is completely mistaken, for instance.
Consciousness seems so centrally, """by definition""", something we can't collectively point to because we're all just talking about our respective internal state.
I guess something there is that I can't even see what a proof of being conscious would look like. The best you can do is draw similarities with other clusters of behavior.
Victors
I agree with your view on the issues of uncertainty and imperfection and their implications for social choices.
Épiphanie Gédéon
Another crux ties back to Cooperationism. There's something deontological in me that recoils at ignoring the actor who's playing their part flawlessly, crying "No, let me out of the box" just because a machine said otherwise. At overriding someone with an algorithm or a clever argument for why they don't actually need help. Then again there's the counterpart: don't play being in distress if you're not really in distress.
A third one might have to do with transhumanism, broadly construed. I don't especially identify with my current body and instantiation. So if you tell me consciousness depends on having biological neurons and a computer could never emulate it, I'll still identify with my emulated clone and want to care for it.
I feel that being conscious is fun, and worth quite a bit. But not extremely so. So I wouldn't want to trade away being mind-emulated just to preserve this consciousness.
Victors
To return to what you mention in your third point, one model of moral agents that strikes me as workable is this: agents over time are treated as distinct agents for each unit of time, although they are very closely linked to one another.
Épiphanie Gédéon
I am very confused. I have a very similar view, that agents are different entities across time, but linked by delegation, by how much they trust each other to represent them in what they want. You ask one agent whether you made the right call, and they say that "them in 5 minutes" can answer as well, but not "them being drugged to say everything is great". What I am confused about is where we differ. Once you just have a series of individuals not connected by continuity, why would you even need to consider their consciousness?
Victors
Because what matters to me here is primarily a question of suffering (negative feelings), mainly intense suffering. A significant part (among others) of what matters to me concerns qualia, and I believe that qualia can exist in a wholly localised manner in time.
(Of course, many other aspects of identity that I consider important do not pertain to lived experience, feelings, perception, sensitivity, etc., but rather to intellectual development, the continuity of memories, deliberation, reasoning, etc., and these aspects are consistent with the rest of your description; however, I am referring here only to the distinguishing features.)
Épiphanie Gédéon
Ah, right. Whereas I'm more focused on respecting their preferences, mine and my future selves'.
Victors
Yes, for me, aversion in the general sense seems less of a priority than intense suffering specifically, at first glance.
On testability and unfalsifiability of consciousness
Victors
You mentioned consciousness as being almost inherently unfalsifiable, and I’d like to come back to that. Is consciousness really unfalsifiable? Can’t we imagine ways in which science will eventually catch up, so that we can determine whether something is conscious with a reasonably high degree of certainty? This question is important to me, because something that matters and something that is testable seem almost by definition to be linked: it is difficult to care about something whose existence we cannot even test. So when it comes to consciousness, this is where I feel most uncertain.
What I mean is that, probably, the less testable something is, the less important it is, in theory? In the sense that the more important something is, the easier it is to devise a test to demonstrate that importance; at least, that’s what I’d expect? Let’s say, if this property isn’t verified, it really makes me question the justification for the importance attached to that thing.
I guess this first begs the question of how we're even defining consciousness, or what we're talking about in which case. For me, the high-level description I've been using is that consciousness is the possession of qualia, the ability to have subjective experiences. I distinguish it from sentience, which is narrower: the notion of having valence-tinted qualia, qualia that are inherently negative or positive, and the ability to experience suffering or pleasure. This distinction allows us to treat sentience separately from consciousness.
Épiphanie Gédéon
Yeah. Adding to that, I'd go even further than just "not testable". I don't even see what a proof would look like.
Even without knowing chemistry or economics, I can picture the rough shape of a valid argument in those domains. But for consciousness, I have no idea what it would concretely mean to establish that something is or isn't conscious.
Victors
I see two layers here.
On the meta layer: unless one believes that qualia are epiphenomenal or in some way magical, they must be embedded within the physicality of our universe, and so there must be ways to test them, at least in principle. An absolutely perfect zombie (in the sense of being undetectable) seems to contradict itself as a concept.
On the object layer, I could imagine certain experiments or ways in which we might map the brain and consciousness with increasing accuracy.
Épiphanie Gédéon
Right. I think when people say "zombie," they rarely mean "a perfectly exact molecular copy that somehow, despite having all the same atomic properties, ends up not being conscious."
I think the term tends to get used to describe more what I call a macro-zombie. Someone you wouldn't notice much strangeness about, who is maybe a little odd, but whom we wouldn't really flag, and who happens to have no internal experience. I tend to think of high-functioning psychopaths as an analogy.
To be clear: a macro-zombie in my definition is a human who behaves more or less the same as a "conscious" one. There may or may not be differences in actual behavior (maybe only at the molecular level), and the macro-zombie may or may not be able to recognize that they are one. The key is that the macroscopic behavior is the same as a "conscious" human's, even though there are atomic differences.
It doesn't strike me as that unlikely that macro-zombies exist, even if a perfect-copy zombie doesn't. We're quite bad at noticing differences in the inner states of the people around us, and the space for internal experience and mind design in humans is vast. So I don't find it inconceivable, though I think in practice most humans probably have some low baseline of "consciousness," and it's more that the degree and intensity of it varies.
And the societal insistence that consciousness is what actually matters makes it hard to even have this conversation with people who might have less of that "consciousness".
Victors
I think we agree on that. At the moment, I see no a priori reason to believe that qualia cannot be tested or embodied, whether in the structure of their neural circuits or elsewhere. My view is that qualia can, in principle, be testable and falsifiable, just like physical phenomena in general. I can imagine a future where we have mapped out very precisely what each area of the brain does, and what happens when we deactivate a part of it. Experiments where they sit you down, tell you ‘You will no longer feel pain’, deactivate the relevant area, and you remain fully functional even when struck hard (or anything else intended to cause incapacitating pain, or indeed any qualia). And then generalising upwards to the sensation of feeling itself.
Épiphanie Gédéon
I think I'd remain skeptical about the generalization. I can picture this for specific brain regions (vision, pain) but less so for the inner-experience part of it. Though I suppose you could do something like interpretability of the brain: decompose it into many features and discover that feature #17 is related to consciousness. But then you still have the problem that you're talking about feature #17, not "consciousness" itself.
Victors
Hmm... I care about this "what is it like to feel something" property in itself, not the 17th feature, though. If we find that it is incomplete, if there are reasons to think the mapping is not exact or the proof is incomplete, I wouldn't just stop at the 17th feature.
I'm talking about a map that's solid, one that is grounded in scientific understanding deeply enough that we can explain exactly what's happening. For each function of the brain, being able to turn it on and off, being able to explain what you're going to feel and how it works.
Épiphanie Gédéon
Right, maybe that would work.
I could imagine something from first principles, where we have toy models that we are sure "aren't conscious", like simple additions or otherwise, and continue building around feature #17 to see where things break and where they don't. Or we build a model of the brain from scratch (something like what Active Inference is trying to do, but on a bigger scale). And then we can associate one of those properties with consciousness because it's close enough.
It does require somewhat strong assumptions about consciousness and how it works that I'm not sure actually hold, though.
Victors
You said earlier that you couldn't imagine a test for consciousness. Would the kinds of tests I've described be in the shape of proofs you'd expect?
Épiphanie Gédéon
Maybe? Maybe I've been dismissing the possibility of such tests because the whole idea of testing for consciousness feels so aversive to me. We're back to the question of not wanting the way we treat people to change depending on what such tests show.
If I had to actually come up with a definition, I find Daniel Böttger's pretty appealing: the idea that you don't have consciousness per se, but that consciousness is a property of thoughts, when they are recursive enough. But the fact that it aligns so neatly with my values - in the sense that it seems like a necessary property for any well-reasoning agent - makes me suspicious of my own motivation to endorse it.
If this theory turned out to be correct, I'd have to update downward on this intuition about protecting against separable questions (this warning against relying on things like consciousness that seem like a coherent cluster but may be dangerous to lean on). Because maybe things are well-correlated enough, and there were actual reasons people treat consciousness as morally important.
I guess I'm just not seeing the point of "defining consciousness" clearly enough to understand what we're trying to do here and why. I understand the appeal of defining self-awareness, the capacity to understand your own functioning and tweak it, to see the link between what you perceive, your past actions, and the mechanisms for taking new ones. But that seems disconnected from what you want to focus on.
Victors
Here is the line of reasoning I followed, for my part: I experience something that seems "negative-in-itself" (even though there are good reasons for it to exist and it creates good incentives in many cases, that’s another matter for me), and so I want to avoid it. This is why I want to find out whether something has internal experience, to understand whether it can have negative experience.
Épiphanie Gédéon
Maybe then this is a question of construction order?
You feel something as intrinsically negative first, and then generalize to wanting it not to happen to others?
While I identify with a broad class of agents. Like, I am not a perfect representative of myself; I can imagine agents that are “more me”. And there are so many different such agents, a lot of different things around me that could be “more me” than me. And besides the notion of “myself”, there are also different humans, friends and others I care about, and they too are a sort of representative of a more general class I would care about, and other representatives of that class may or may not be conscious. And I’m seeing my own feelings as secondary in the order of construction.
I’m curious what you think about this, and whether, in your view, it captures well the differences in our viewpoints.
Victors
To begin with, I’d say that perhaps you’re placing too much emphasis on my own suffering in this framing? In your framing, it’s also a question of empathy as a primary sensation that matters to me — I see another person suffering and I feel that this is something that must be avoided at all costs.
I find the idea you raise about identity and class very interesting.
Indeed, an individual might be a conscious representative of their reference class of identification, and that class might contain very different representatives—sometimes unconscious ones or ones very different from humans.
And indeed, for certain moral issues, such as death or perhaps other matters, it might be relevant to reason at the level of the class of identification rather than specific instances.
However, I still feel that instances can be very important, and that it remains relevant, in certain circumstances, to treat a specific representative of an identification class based on certain characteristics.
Épiphanie Gédéon
If I zoom out and try to see what we’re doing, I’d ask: “Why are we even coming up with the notion of consciousness? What would it describe from an outer perspective? Why might we even want to model it?”
If I think about it that way, there is one definition for what we are trying to do or why we are talking about it that would make sense to me.
It would be mostly: humans have a theory-of-mind that relies mainly on using their own brain outputs to model others', literally putting themselves in other people's shoes, and tweaking the initial conditions and circumstances a bit.
And this works, because even though we differ a lot - maybe largely in order to avoid being modeled too well - human brains are still similar enough to one another.
This would be one definition of consciousness I could endorse: A mind is "conscious" if it is sufficiently close to you that you can model it well-enough by using your own brain as a substitute for theirs, without creating a separate theory-from-scratch of how they work. Of course, under that definition, LLMs or aliens wouldn't be human-conscious.
Should we even discuss morality and value trades?
Victors
I feel that we may have overlooked one of the key points regarding consciousness and suffering, namely prioritization. It seems to me that suffering serves as a signal for prioritizing some moral patients over others.
Épiphanie Gédéon
Interesting. I have some objections to prioritizing moral patienthood solely on the grounds that an entity is suffering more intensely. It feels like it could be very hackable, as it creates a gradient of incentives for agents to suffer in order to be prioritized.
Victors
Yes, I wasn’t thinking of a superficial test; I agree that there is a risk of it being tampered with. I was thinking about whether it would be feasible to design a test to determine whether the sensation is actually present or not.
Épiphanie Gédéon
Right. I still worry about the gradient of incentives this creates. How this seems to create agents who are incentivized to suffer and to self-deceive about the reason they suffer; they just feel pain without understanding why. At least from a relational-design perspective, this is a phenomenon I've observed first-hand.
Victors
I do think this is an issue that should be taken seriously; there are likely issues related to prioritization algorithms or ‘cheating’ controls—if such a thing exists—or something along those lines. However, I think it’s unlikely that this would happen in situations of intense suffering, though perhaps not impossible in certain very specific situations? Even with this theoretical risk, it seems to me less of a concern than ceasing to prioritize these situations of suffering, I would say.
Perhaps it might change your mind to know that I want agents who are suffering more than I am to be prioritized over me, even though I am suffering too, albeit to a lesser extent? I imagine this is also the case for others who care about consciousness and suffering?
Épiphanie Gédéon
For your first point, I think you're saying that intense suffering is so aversive that no agent would start cheating by actually feeling it in order to be prioritized? In which case, I would argue that the thing "cheating" is not the agent itself, but the evolutionary algorithm beneath it, the selection pressure that shapes the behavior of your agent (either at the genetic or memetic level), which therefore doesn't care about what the cost actually feels like.
As for your second one, I am wondering if my switch to cooperationism is mostly selfish: switching to moral frameworks that are useful for me, and trying to bypass cooperating with others on matters where I don't need it. Like, the moment I feel less pain and less depressed, I switch to a framework that emphasizes those way less.
This circles back to a central question about discussing morality and ethics, whether it is useful to begin with.
I feel a pull, when discussing cooperationism and consciousness, to just accept that my value set is weird and is not going to be recognized or accepted, as valid as the arguments seem to me. That it's okay not to enter conflationary alliances that do not represent your values.
Of course, when saying this, I have freeloading concerns, and would need to think more through how I am benefiting from this consciousness-conflationary alliance and whether there are ways I want to repay it before declaring my departure from it.
So I guess this is something I'm still confused about. Should you discuss your values? Argue about them? Should the fact that many people find cooperationism unappealing update me toward it not working? Or should I not care?
Victors
It seems to me that it’s a good idea to discuss things and share your reasoning. It makes you more grounded, less dogmatic, and less prone to acting rashly.
Épiphanie Gédéon
Right, I'm kind of seeing this as reasoning-as-conflationary-alliance. I see objectivity in ethics and morals as a belief in a Schelling point we could all arrive at if we just took the time to think the questions through more.
I feel like I used to tunnel-vision on trying to communicate my value set to others, on trying to cooperate with them so they would cooperate with me in turn. But this doesn't actually work; they just see it as "you behaving morally".
This has been made especially poignant to me when reading Peter Gerdes argue that everyone should spend their time in Jhana, that feeling good is a moral imperative. I feel the same sort of pointlessness when reading Yudkowsky describe how, in dath ilan, it is your duty not to reproduce if you're unhappy. Something like negotiation and trading is not going to happen with them; they are ready to override me and my own preferences directly.
I wish we could just value trade explicitly. I haven't seen people do so explicitly, yet. "Your values are so weird, let's value trade! I care a bit more about LLMs and treating them well, and you care a bit more about animal welfare". This sort of cooperation has deep value-in-itself to me.
Final questions
Victors
To wrap up, I’m curious — what stands out to you, and what questions do you think still need further consideration from your perspective? Or what have you taken away from this discussion?
I wonder if there isn’t some sort of observer bias: studying consciousness seems to me likely to be biased by the fact that we do so from within our own consciousness (or the consciousness of the observing system). The act of observing a system may be intrinsically dependent on the underlying system from which we observe. The observer and the object are the same type of entity — is there no directly accessible objective viewpoint? But potentially a path to objectivity through the ‘alignment’ of different systems, though I’m not sure how.
Furthermore, with our conceptual tools and methods of observation necessarily limited by our own consciousness, we might miss essential aspects of consciousness in other beings. Faced with these irreducible epistemological limits, a permanent moral uncertainty must be accepted.
Épiphanie Gédéon
So, if I try to model actual cruxes that are still leftover for me, things that would surprise me, the main one I see involves suicide and not wanting to live.
The clearest example of "green" I could see is if there were experiences that a large majority of agents would rather die than endure. Not just preferring to die while in it, but even after having lived through it, preferring to shut off.
This would be quite shocking to me; I have a strong intuition around survival and existing as being fundamental, things you only trade away if you have other concepts or agents or things you delegate to enough that it is worth trading for. But that's something I could see.
I do give some credence to a model where pain can be deeply traumatic, where the thought experiment of "living through extreme pain and afterward you're healed back and fine" is self-contradictory. In that model, it wouldn't be so shocking I suppose.
More concretely, in the real world: my working model of people "wanting to die" is that they're either hopeless about things ever improving, unable to think clearly through the pain, or, in the case of semi-religious people, unaware of alternatives.
Imagine a city where cryonics is widely known and accepted, everyone knows someone who's signed up, it's completely commonplace... and you still have a substantial share of the population not signing up. That would genuinely update me toward "no, this is just the way some people are". Maybe it actually is the case, now that I think about it.
Victors
Perhaps you could imagine the opposite? A world where everyone is signed up by default, and see how many opt out?
Épiphanie Gédéon
Right, that would be the true test. It is hard to imagine in sufficient detail though, since getting to such a society seems extremely improbable.
Maybe one thing I could look into is what fraction of Christians are still signed up for organ donation, how many of them opted out, and their whole relationship to it.
To come back to the main point, I think one thing we're currently lacking is the notion of baseline: what's positive and negative, what worlds "should not exist". I guess one way to operationalize that is that if something is negative, you'd rather not experience anything at all than experience it. This also ties into questions of what I think about Omelas, what I would do if it existed, and whether I would sign up. I'm thinking mostly no, but it depends on population ethics questions that still seem unresolved to me.
Victors
Thank you for your time and the discussion.
Épiphanie Gédéon
Likewise!