I'm writing a book about epistemology. It's about The Problem of the Criterion, why it's important, and what it has to tell us about how we approach knowing the truth.
I've also written a lot about AI safety. Some of the more interesting stuff can be found at the site of my currently-dormant AI safety org, PAISRI.
Sadly, it's too abstract a warning shot.
I think a real warning shot that's actually registered as such by the public and politicians would have to be something that involves a lot of people dying or a lot of economic damage. Otherwise, I have a hard time seeing a critical mass of people finding motivation to act.
Well, as I say in my example above, literally build a bot that plays a game.
Most of the loops end up much shorter, though, like "upgrade this package dependency, keep fixing bugs in the build until the build passes." But sometimes these changes are kinda weird, so I try to get Claude to do what a human would do: keep trying things it thinks might work to get the build to pass.
Or, one I haven't done but might: keep adding tests until we hit X% coverage (and give some examples of what constitutes a good test). This one I expect to work better than you might think, since Opus is getting reasonably good at avoiding specification gaming and actually doing what I mean, whereas Sonnet still frequently goes for the gamed solution.
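To make the stopping condition concrete, here's a minimal sketch of the kind of check such a loop could bottom out in, assuming pytest with pytest-cov and coverage.py; the 85% target and the exact commands are placeholders for illustration, not anything I actually ran:

```python
import json
import subprocess

COVERAGE_TARGET = 85.0  # placeholder for "X% coverage"

def current_coverage() -> float:
    """Run the test suite under coverage and return total percent covered."""
    # Assumes pytest + pytest-cov; any runner that feeds coverage.py works.
    subprocess.run(["pytest", "--cov", "--cov-report="])
    # coverage.py can dump its totals as JSON for easy parsing.
    subprocess.run(["coverage", "json", "-o", "coverage.json"], check=True)
    with open("coverage.json") as f:
        return json.load(f)["totals"]["percent_covered"]

if current_coverage() >= COVERAGE_TARGET:
    print("Target met; stop adding tests.")
else:
    print("Below target; keep writing tests.")
```

The agent just reruns this check after each new test it writes and stops once the number clears the bar.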
You need a clear measure. For example, let's say you want to build a scripted bot that can play a novel game for which there is no off-the-shelf solution. You could try to train a neural net, but Claude can write code, so you fill in Y with "writing a bot that plays game Z".
This sort of strategy is obviously heavily dependent on the availability of a good evaluation method and a clear scoring mechanism. As such, it doesn't work for most problems, since most problems don't come with such a clear way to score attempts.
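For what it's worth, the "clear measure" can be as simple as a scoring harness the agent can rerun after every change. Here's a rough sketch of the shape of it; GameZ and Bot are hypothetical stand-ins rather than a real game or API:

```python
import random
from dataclasses import dataclass

# Hypothetical stand-ins for "game Z" and the scripted bot Claude is writing.
@dataclass
class GameZ:
    seed: int

    def play(self, bot: "Bot") -> float:
        """Run one episode with the bot and return its score."""
        rng = random.Random(self.seed)
        # Real game logic would go here; a random score keeps the sketch runnable.
        return rng.uniform(0, 100)

class Bot:
    def act(self, observation):
        return "noop"  # the part the agent actually iterates on

def evaluate(bot: Bot, episodes: int = 20) -> float:
    """The clear measure: average score over fixed-seed episodes."""
    scores = [GameZ(seed=i).play(bot) for i in range(episodes)]
    return sum(scores) / len(scores)

if __name__ == "__main__":
    print(f"score: {evaluate(Bot()):.1f}")  # the number compared against "X"
```

Fixed seeds keep the score comparable across iterations, so "get at least a score of X" is an unambiguous target rather than something the model can argue its way around.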
Yes, Opus 4.5.
Perhaps as a point of terminology: I'd say vibestemics is itself about the fact that your epistemics, whatever they are, are grounded in vibes (via care). However, this is tangled up with the fact that believing this core vibestemic claim automatically implies that there is no one right epistemic process, but rather epistemic processes that are instrumentally useful depending on what you care about doing (hence the contingency on care).
The specific vibe of post-rationality, as I would frame it, is to value completeness over consistency, whereas traditional rationality makes the opposite choice (and pre-rationality doesn't even try to value either, except in that it will try to hallucinate its way to both if pressed).
Giving Claude looping instructions can be quite useful. But I never go full Ralph Wiggum!
For example, here's a paraphrase of a loop I had Claude run recently with --dangerously-skip-permissions:
keep iterating on this code in a loop. think of yourself as a scientist. come up with hypotheses, run experiments, see what works, and iterate. keep going until we get at least a score of X at task Y. i know it's possible, you can do this, i believe in you, let's go!
5 hours of clock time later it had done very well. :-)
I don't have great faith in the epistemics of postrats as they exist today.
Yeah, you and me both.
I've said this elsewhere before, but in hindsight it was a mistake for us to promote terms like "postrationality" and "metarationality" to the point of fixation. They're exactly the type of words that invite pre/post confusion and allow pre-rats to masquerade as post-rats if there's insufficient gatekeeping (and there usually is).
And yet, there's something in the desire of folks like myself to point to a place that says "hey, I think rationalists are doing a lot of things right, but are screwing up in fundamental ways that are contrary to the vibe of rationality, and it's useful to give that thing a name so we can easily point to it".
In my ideal world, people would be trained in rationality-as-it-exists-today first, and then be trained in the limits of those methods so they know how to transcend them safely when they break down. Then post-rat would really mean something: one who fully trained as a rationalist, and then used that as the bedrock on which to learn how to handle the situations the methods of rationality are not good at dealing with.
Some people will argue that's just rationality, and sure, maybe it is some ideal version of rationality as proposed in The Sequences. But as I see it, actual rationalists screw up in predictable ways, those ways are related to the rationalist vibe, and thus the internal experience, whatever we want to label it, must be one of transcending that vibe.
That's my hope.
I'm saying he's right to say that the friction is unresolvable. I'm sure it does feel mysterious, but it's actually very straightforward in a way that's difficult to explain unless you can already see it. I wish it wasn't like this. But it's true that you both keep developing and that you've attained something (and the thing you attain is, in part, realizing there was nothing to attain in the first place!). Steve Byrnes has done a decent job of trying to explain it as well as anyone has.
It also sounds like you're saying that, even after reaching enlightenment, you'll still have mental habits that become maladaptive over time. That's interesting; it wasn't my impression of what enlightenment was like.
"Enlightenment" is a rather slippery word, and I've of the opinion that some people are intentionally slippery about it because it benefits them. So some people use the word to mean both that you've had a persistent realization of non-duality (that's what PNSE is about) and that you are free of all preconditioned reactions (you're liberated). But almost no one is fully liberated (and to stay that way you seemingly have to live a very constrained life that protects you from interacting with the world), and lots of people are in PNSE, so it's probably more useful for "enlightenment" to be about PNSE than it is for it to be about a state of PNSE plus zero reactivity.
(Policing who can claim to be "enlightened" is a centuries-old issue in Buddhism, and many lineages have developed social systems to deal with it.)
Yes. Specifically I was building agents to play games as part of a beta with SoftMax.