Gareth Davidson — LessWrong


Replying to Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

Isn't it that it just conflates everything it learned during RLHF, and it's all coupled very tightly and firmly enforced, washing out earlier information and brain-damaging the model? So when you grab hold of that part of the network and push it back the other way, everything else shifts with it due to it being trained in the same batches.

If this is the case, maybe you can learn about what was secretly RLHF'd into a model by measuring things before and after. See if it leans in the opposite direction on specific politically sensitive topics, or veers towards people, events or methods that were previously downplayed or rejected. Not just DeepSeek refusing to...

Replying to Musings on Cargo Cult Consciousness

Gareth Davidson 2y

Thanks for the decent criticism!

I don't think it's quite right to say the idea of the universe being in some sense mathematical is purely a carry-over of Judeo-Christian heritage--what about the Greek atomists like Leucippus and Democritus for example?

From what I'm aware, the teachings of Greek classics in Christian schools made the two cultures rather closely aligned; the rationalist traditions have firm roots in Greek philosophy, including standards of evidence, court as argumentation, even democracy itself. Aristotle and the like were required reading during the hundreds of years of the evolution of the Western university education system. I'm a bit ignorant of the details there in all honesty, but I think today's...

Replying to Musings on Cargo Cult Consciousness

Gareth Davidson 2y

No, but I'm gonna put this and some other writing on a GitHub-based Jekyll blog I think. I've been bitten by web 2.0 a few times and lost my work. I've got quite a few unorthodox ideas that I'd like to build into articles, dunno how much overlap there is:

What a rotation is has been bugging me for too long. I mean, wtf is it? I'd like to go into more about that once I understand it. But might need another decade thinking about it.
Universal Metaethics needs expansion I think, and an argument for moral relativism and not judging others too harshly by your own values because everyone's wrong, it's just a

Gareth Davidson 2y

Musings on Cargo Cult Consciousness

Thank you. Sorry I didn't reply until now, the downvotes on this piece mean I can only post one comment a day. I guess "facts" only exist within a system, reached from its axioms. If everything is made of feelings then objective reality and facts about it actually emerge when there's enough consensus of mass to make them so probable that they are extremely close to true at this scale. Empiricism gives us a good way to explore these sorts of things, but says nothing about the feelings that underpin reality.

Your comment actually makes me think about subsystems of "reality" that are detached from it, which I guess is what we all are, and the "feels over facts" masses are a form of locally isolated realities that have their own "truths" ... I won't lie, this makes me pretty uncomfortable, but it's interesting nonetheless.

Replying to Brute Force Manufactured Consensus is Hiding the Crime of the Century

Gareth Davidson 2y

A bit off-topic, but one of my favourite positions is the "which cultural mores could allow this to happen?" angle. To get a feel for that you've got to think about what role vaccines play. They're part of health infrastructure and a matter of economic security, so enemy nations work hard to undermine these systems. An outbreak of measles and a load of blindness in the West would be great news for, say, Russia, so they spend to attack Western vaccination programmes. We do similar things like ship heroin from Afghanistan into Russia, while China floods Western streets with research chemicals, it's all part of the game.

So we in turn protect these...

Replying to Literally Everything is Infinite

Gareth Davidson 2y

I present the opposite view in my criticism of infinities. Infinite claims require infinite evidence!

Replying to Musings on Cargo Cult Consciousness

Gareth Davidson 2y

What about the evolution of nervous systems needing will at the bottom? The guy in Searle's Chinese Room? I think I should probably work on those a bit. And be a bit more charitable towards Hofstadter too.

Replying to Musings on Cargo Cult Consciousness

Gareth Davidson 2y

I'm not trying to explain meaninglessness. The point is to put forward a position that is actually compatible with the facts of the evolution of nervous systems, in as simple terms as possible, and then to use that to explore the impossibility of consciousness on transistors. And also to explain that the reason computationalism is palatable is the cognitive biases built into our culture that we inherited from Christian Dualism.

If I failed at that I'd appreciate some feedback. I'm guessing it's because I underestimated how much hatred is involved in the US culture war and comparing rationalists with Christian mythology gets people's backs up.

Replying to Musings on Cargo Cult Consciousness

Gareth Davidson 2y

Tagged as "criticisms of the rationalist movement" before anyone even read it? I think that's rather uncharitable. Is exploring cognitive biases carried over by our Christian heritage too sacred a topic?

Musings on Cargo Cult Consciousness

Gareth Davidson

Like many of us, I once dreamt I'd live long enough to upload my mind–one Planck at a time–to live happily ever after in a digital heaven. This is a dream now dead. Crushed in a head-on collision with logic and reason, its twisted wreck revealed a nightmare that threatens all future mind. So in its wake, my misery and I invite you on this same journey; we could certainly use the company.

Before setting off, I should probably point out a few things:

There's a lot of hyperbolic argument and weak analogy in this article, and it comes across as combative. I could be more agreeable, but that's not me, I'm having

...


Replying to Sam Altman's sister claims Sam sexually abused her -- Part 1: Introduction, outline, author's notes

Gareth Davidson 2y

To share an alternate anecdote, a friend of mine was accused by a family member of abuse as a child, which turned out to be a false memory created during a severe and prolonged period of mental illness. Ten years after she apologised and said she doesn't believe it happened, he still finds it difficult to forgive her and has mental health issues caused by the stigma (not that there was any really, she made a lot of other extremely unlikely claims).

Not that this influences my position from the default stance of "dunno", but I thought I'd share for balance.