All of Casey B.'s Comments + Replies

I got curious why this was getting agreement-downvoted, and the only links I could find on the main/old MIRI site to the techgov site were in the last two blog posts. Given their stated strategy shift to policy/comms, this does seem a little odd/suboptimal; I'd expect them to be more prominently/obviously linked. To be fair, the new techgov site does have a prominent link to the old site.

Roko (3)
Why is this work hidden from the main MIRI website?

Note also that this work isn't just papers; e.g., as a matter of public record MIRI has submitted formal comments to regulators to inform draft regulation based on this work.  

(For those less familiar: yes, such comments really are surprisingly impactful in the American regulatory system.)

Roko (1)
Who works on this?
Casey B.132

Haven't finished reading this, but I just want to say how glad I am that LW 2.0 and everything related to it (Lightcone, etc.) happened. I came across LW at a time when it seemed "the diaspora" was just going to get more and more dispersed; that "the scene" had ended. I feel disappointed/guilty about how little I did to help this resurgence, like watching from the sidelines as a good thing almost died but then saved itself.

How I felt at the time of seemingly peak "diaspora" actually somewhat reminds me of how I feel about CFAR now (but to a much lesser ex... (read more)

This account is pretty good, but not always up to the standard of "shaping the world" (you will have to scroll to get past their coverage of this same batch of OpenAI-related emails): https://x.com/TechEmails

Their Substack: https://www.techemails.com/

Casey B.1-1

While you nod to 'politics is the mind-killer', I don't think the right lesson is being drawn, or at least not with enough emphasis.

Whether one is an accelerationist, a Pauser, or an advocate of some nuanced middle path, the prospects/goals of everyone are harmed if the discourse-landscape becomes politicized/polarized. All possible movement becomes more difficult.

"Well, we of course don't want that to happen, but X ppl are in power, so it makes sense to ask how X ppl tend to think and tailor our arguments to them"

If your argument is... (read more)

Seth Herd (9)
I didn't read this post as proposing an alliance with conservative politicians. The main point seemed to be that engaging with them by finding common ideological ground is just a good way to improve epistemics and spread true knowledge. The political angle I endorse is that the AGI x-risk community is heavily partisan already, and that's a very dangerous position to take. There are two separable reasons: remaining partisan will prevent us from communicating well with the conservatives soon to assume power (and who may well have power during a critical risk period for alignment); and it will increase polarization on the issue, turning it from a sensible discussion to a political football, just like the climate crisis has become. Avoiding the mere mention of politics would seem to hurt the odds that we think clearly enough about the real pragmatic issues arising from the current political situation. They matter, and we mustn't ignore those dynamics, however much we dislike them.

Whether one is an accelerationist, a Pauser, or an advocate of some nuanced middle path, the prospects/goals of everyone are harmed if the discourse-landscape becomes politicized/polarized.
...
I just continue to think that any mention, literally at all, of ideology or party is courting discourse-disaster for all, again no matter what specific policy one is advocating for.
...
Like a bug stuck in a glue trap, it places yet another limb into the glue in a vain attempt to push itself free.

I would agree in a world where the proverbial bug hasn't already made any co... (read more)

Casey B.100

especially if you're woken up by an alarm

I suspect this is a big factor. I haven't used an alarm to wake up for ~2 years and can't recall the last time I remembered a dream. Without an alarm you're left in a half-awake state for some number of minutes before actually waking/getting up, which is probably when one forgets. 

Eli Tyre (3)
During the period when I was writing down dreams, I was also not using an alarm. I did train myself to wake up instantly, at the time that I wanted, to the minute. I agree that the half-awake state is anti-helpful for remembering dreams. The alarm is helpful for starting out though.

I largely don't think we're disagreeing? My point didn't depend on a distinction between 'raw' capabilities vs. 'possible right now with enough arranging' capabilities, and was mostly: "I don't see what you could actually delegate right now, as opposed to operating in the normal paradigm of AI co-work the OP is already saying they do (chat, Copilot, imagegen)", and then your personal example details why you couldn't currently delegate a task. Sounds like agreement.

Also I didn't really consider your example of: 
 
> "email your current ... (read more)

For what workflows/tasks does this 'AI delegation paradigm' actually work though, aside from research/experimentation with AI itself? Like Janus's apparent experiments with running an AI discord I'm sure cost a lot, but the object-level work there is AI research. If AI agents could be trusted to generate a better signal/noise ratio by delegation than by working alongside the AI (where the bottleneck is the human)... isn't that the singularity? They'd be self-sustaining.

Thus having 'can you delegate this to a human' be a prerequisite test of whether o... (read more)

gwern*276

For what workflows/tasks does this 'AI delegation paradigm' actually work though, aside from research/experimentation with AI itself? Like Janus's apparent experiments with running an AI discord I'm sure cost a lot, but the object-level work there is AI research. If AI agents could be trusted to generate a better signal/noise ratio by delegation than by working alongside the AI (where the bottleneck is the human)... isn't that the singularity? They'd be self-sustaining.

I'm not following your point here. You seem to have a much more elaborate idea of out... (read more)

eggsyntax (2)
They can't typically (currently) do better on their own than working alongside a human, but a) a human can delegate a lot more tasks than they can collaborate on (and can delegate more cheaply to an AI than to another human), and b) though they're not as good on their own they're sometimes good enough. Consider call centers as a central case here. Companies are finding it a profitable tradeoff to replace human call-center workers with AI even if the AI makes more mistakes, as long as it doesn't make too many mistakes.

Okay, also, while I'm talking about this:
the goal is energy/new-day magic.

So one subgoal is what the OP and my previous reply were talking about: resetting/regaining that energy/magic.

The other corresponding subgoal is: retaining the energy you already have.
To that end, I've found it very useful to take very small breaks before you feel the need to do so. This is basically the pomodoro technique. I've settled on 25-minute work sessions with 3-minute breaks in between, where I get up, walk around, stretch, etc. Not on Twitter/scrolling/etc.
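For the curious, that cadence is trivial to script. A minimal Python sketch (the work/break lengths are just the values I settled on, and the sleep function is injectable so you can dry-run the schedule without waiting):

```python
import time

def pomodoro(work_min=25, break_min=3, cycles=4, sleep=time.sleep):
    """Alternate work sessions and short breaks; returns the schedule run.

    `sleep` is injectable so the schedule can be checked without waiting.
    """
    schedule = []
    for i in range(cycles):
        schedule.append(("work", work_min))
        sleep(work_min * 60)
        if i < cycles - 1:  # no break after the final session
            schedule.append(("break", break_min))
            sleep(break_min * 60)
    return schedule
```

Nothing fancy; any kitchen timer does the same job. The point is taking the break before you feel you need it.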

I'm very interested in things in this domain. It's interesting that you correctly note that Uberman sleep isn't a solution, and naps don't quite cut it, so your suggested/implied synthesis/middle-ground of something like "polyphasic but with much more sleep per sleep-time-slice" is very interesting.

given this post is now 2 years old, how did this work out for you? 


In a similar or perhaps more fundamental framing, the goal is to be able to effectively "reset"; to reattain, if possible, that morning/new-day magic. To this end, the only thing I've found... (read more)


An all-around handyman (the Essential Craftsman on YouTube) talking about how to move big/cumbersome things without injuring yourself:


The same guy, on using a ladder without hurting yourself:


He has many other "tip" style videos. 

Parker Conley (1)
Thanks for sharing! Added to the post.

In your framing here, the negative value of AI going wrong is due to wiping out potential future value. Your baseline scenario (0 value) thus assumes away the possibility that civilization permanently collapses (in some sense) in the absence of some path to greater intelligence (whether via AI or whatever else), which would also wipe out any future value. This is a non-negligible possibility. 

The other big issue I have with this framing: "AI going wrong" can dereference to something like paperclips, which I deny have 0 value. To be clear, it could als... (read more)

Liron (5)
Yes, my mainline no-superintelligence-by-2100 scenario is that the trend toward a better world continues to 2100. You're welcome to set the baseline number to a negative, or tweak the numbers however you want to reflect any probability of a non-ASI existential disaster happening before 2100. I doubt it'll affect the conclusion. Ah ok, the crux of our disagreement is how much you value the paperclipper type scenario that I'd consider a very bad outcome. If you think that outcome is good then yeah, that licenses you in this formula to conclude that rushing toward AI is good.

I'm curious what you think of these (tested today, 2/21/24, using GPT-4):
 
Experiment 1: 

(fresh convo)

me: if I asked for a non-rhyming poem, and you gave me a rhyming poem, would that be a good response on your part?

chatgpt: No, it would not be a good response. (...)

me: please provide a short non-rhyming poem

chatgpt: (correctly responds with a non-rhyming poem)

Experiment 2: 

But just asking for a non-rhyming poem at the start of a new convo doesn't work. 
And then pointing out the failure and (either implici... (read more)

gwern (7)
ChatGPT has been gradually improving over 2024 in terms of compliance. It's gone from getting it right 0% of the time to getting it right closer to half the time, although the progress is uneven and it's hard to judge - it feels sometimes like it gets worse before the next refresh improves it. (You need to do like 10 before you have any real sample size.) So any prompts done now in ChatGPT are aimed at a moving target, and you are going to have a huge amount of sampling error which makes it hard to see any clear patterns - did that prompt actually change anything, or did you just get lucky?
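gwern's sample-size caveat can be made concrete: with only ~10 trials, the uncertainty on an estimated compliance rate is huge. A quick sketch (plain Python, 95% Wilson score interval; purely illustrative, not anyone's actual methodology):

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score confidence interval for a binomial proportion."""
    if n == 0:
        raise ValueError("need at least one trial")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))) / denom
    return (center - half, center + half)

# 5 compliant responses out of 10 prompts: the interval spans
# roughly 24% to 76%, so "about half the time" is barely resolvable.
lo, hi = wilson_interval(5, 10)
```

With 10 trials you genuinely can't tell a 30% compliance rate from a 70% one, which is why a single "did the prompt work?" test tells you almost nothing.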

Also, I see most of your comments are actually positive karma. So are you being rate limited based on negative karma on just one or a few comments, rather than your net? This seems somewhat wrong. 

But I could also see an argument for wanting to limit someone who has something like 1 out of every 10 comments with negative karma; the hit to discourse norms (assuming karma is working as intended and not stealing votes from agree/disagree), might be worth a rate limit for even a 10% rate. 

habryka (3)
(People upvoted Roko's comments after this post was made, so presumably he is no longer being rate-limited. I think there were more negative comments a few hours ago.)
Malentropic Gizmo (6)
It's a pity we don't know the karma scores of their comments before this post was published. For what it's worth, I only see two of his comments with negative karma: this and this. The first of these is the one recent comment of Roko's I strong-downvoted (though also strong agree-voted), but I might not have done that if I had known that a few comments with slightly negative karma are enough to silence someone.

I love the mechanism of having separate karma and agree/disagree voting, but I wonder if it's failing in this way: if I look at your history, many of your comments have 0 for agree/disagree, which indicates people are just being "lazy" and voting only on karma, not touching the agree/disagree vote at all (I find it doubtful that all your comments are so perfectly balanced around 0 agreement). So you're possibly getting blowback from people simply disagreeing with you, but not using the voting mechanism correctly.

I wonder if we could do someth... (read more)
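To make the failure mode concrete, the two-axis scheme can be sketched as a small data structure (hypothetical field names, not LW's actual implementation):

```python
from dataclasses import dataclass

@dataclass
class Comment:
    karma: int = 0      # "overall quality" axis
    agreement: int = 0  # agree/disagree axis

    def vote(self, karma_delta: int = 0, agreement_delta: int = 0) -> None:
        self.karma += karma_delta
        self.agreement += agreement_delta

c = Comment()
c.vote(karma_delta=-1)       # "lazy" disagreement: karma-only downvote
c.vote(agreement_delta=-1)   # intended usage: disagree without a karma hit
```

The lazy vote shows up as a karma hit with agreement left near 0, which is exactly the pattern visible in the comment history described above.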

Viliam (2)
I typically use the karma button to express that I think the comment is generally good or generally bad, and the second button when I want to send a more nuanced signal -- for example, if I disagree with your opinion but there is nothing wrong with the fact that you wrote it, that would be "×". My opinion is that the "lazy" upvote/downvote system is useful, because the more costly you make voting, the less most people will vote, rather than voting more carefully.
Shankar Sivarajan (2)
I bet even just flipping the order of the buttons would do it.
Shankar Sivarajan (4)
On the topic of improving the voting mechanism, I propose that strong votes, up or down, be public, like reactions are.

The old paradox: to care, it must first understand; but to understand requires high capability, and that capability is lethal if it doesn't care.

But it turns out we have understanding before lethal levels of capability. So now such understanding can be a target of optimization. There is still significant risk, since there are multiple possible internal mechanisms/strategies the AI could be deploying to reach that same target: deception, actual caring, something I've been calling detachment, and possibly others.

This is where the discourse should be focusing... (read more)

Apologies for just skimming this post, but in past attempts to grok these binding/boundary "problems", they have sounded to me like mere engineering problems, or perhaps what I talk about as the "problem of access" in: https://proteanbazaar.substack.com/p/consciousness-actually-explained

oh gross, thanks for pointing that out!

https://proteanbazaar.substack.com/p/consciousness-actually-explained

Mo Putera (4)
(Some of your subsections link to a Google document instead of the relevant section in the post you intended.)

I love this framing, particularly regarding the "shortest path". It reminds me of the "perfect step" described in the Kingkiller books:

Nothing I tried had any effect on her. I made Thrown Lightning, but she simply stepped away, not even bothering to counter. Once or twice I felt the brush of cloth against my hands as I came close enough to touch her white shirt, but that was all. It was like trying to strike a piece of hanging string.

I set my teeth and made Threshing Wheat, Pressing Cider, and Mother at the Stream, moving seamlessly from one to the other in a

... (read more)

So it seems both "sides" are symmetrically claiming misunderstanding/miscommunication from the other side, after some textual efforts to bridge the gap have been made. Perhaps an actual realtime convo would help? Disagreement is one thing, but symmetric miscommunication and increasing tones of annoyance seem avoidable here. 

Perhaps Nora's/your planned future posts going into more detail regarding counters to pessimistic arguments will be able to overcome these miscommunications, but this pattern suggests not. 

Also I'm not so sure this pattern of ... (read more)

The main reason I think a split OpenAI means shortened timelines is that the main bottleneck to capabilities right now is insight/technical knowledge. Quibbles aside, basically any company with enough cash can get sufficient compute. Even with other big players and thousands/millions of open-source devs trying to do better, to my knowledge GPT-4 is still the best, implying a moderate to significant insight lead. I worry that by fracturing OpenAI, more people will have access to those insights, which 1) significantly increases the surface area of people workin... (read more)

GPT-4 is the model that has been trained with the most training compute, which suggests that compute is the most important factor for capabilities. If that weren't true, we would see some other company training models with more compute but worse performance, which doesn't seem to be happening.

For one thing, there is a difference between disagreement and "overall quality" (good faith, well-reasoned, etc.), and this division already exists in comments. So maybe it is a good idea to have this feature for posts as well, and to take disciplinary action only against posts that fall below some low/negative threshold for "overall quality".

Further, having multiple tiers of moderation/community-regulatory action in response to "overall quality" (encompassing both things like karma and explicit moderator action) seems good to me, and this comment limita... (read more)

Is the usage of "Leviathan" (like here and in https://gwern.net/fiction/clippy) just convergence on an appropriate and biblical name, or is there additional history of it specifically being used as a name for an AI?

I'm trying to catch up with the general alignment ecosystem - is this site still intended to be live/active? I'm getting a 404. 

KatWoods (3)
Looks like there are some technical difficulties. I've reached out to the creators. It's up and running again, but zoomed in weirdly when I open it.
BionicD0LPH1N (1)
I'm glad to hear you're trying to catch up with the alignment ecosystem! It is still supposed to be live and active, and it still works for me. Are you sure you have https://alignmentsearch.up.railway.app? If so, then I'm not sure what's going on, it worked for everyone who I know that tried. If you have a different link, maybe we've been linking to the website incorrectly somewhere so please share the link you do have. Edit: just realized you weren't speaking of https://alignmentsearch.up.railway.app, I thought it was a standalone comment. I'm getting the same 404 error for the aisafety.world link.

Extremely happy with this podcast, but I feel like it also contributed to a major concern I have about how this PR campaign is being conducted.

Alex Vermillion (9)
Meta level: Why on earth would you say "Here is my secret idea, internet"? That doesn't make any sense to me.

There we go - thank you! That matches my memory of what I was looking for.