LESSWRONG
LW

All of Nicholas / Heather Kross's Comments + Replies

Something else I just realized: Georgism is a leftish idea that recognizes some (but not all) leftish ideas I've discussed or referenced above, and its modern form is currently rationalist-adjacent. Progress!

2Mateusz Bagiński1mo

In what sense is Georgism "leftish"?

4Viliam2mo

You could interpret that as "rationalists are moving to the left... expect more leftism in the future", or you could interpret that as "rationalists are choosing the parts that seem correct to them (and rejecting the parts that don't)... expect more cherry-picking from all directions in the future".

We Fell For It

Nicholas / Heather Kross2mo42

Ah, sorry yeah I think it was a mistake on my part to mostly make the post a verbatim Discord reply. Lots of high-context stuff that I didn't explain well.

This specific part is (in my usage/interpretation; if you click the link, the initial context was an Emmett Shear tweet) basically a shorthand for one or more "basic" leftist views, along the lines of these similar-but-somewhat-distinct claims:

Capitalism more-reliably rewards power-maximizers than social-utility-maximizers.
Under capitalism and similar incentive-structures, we'd expect conflict theory

... (read more)

-1sloonz2mo

I think it was correct to ask the question, because, no, sorry, I don’t think you can equate that specific understanding of "property is downstream of power" with the same level of consensus/reasonableness as "prices are downstream of supply and demand", whereas the understanding given by Viliam is pretty much inside of what I would expect to be as consensual as "prices are downstream to supply and demand". For example, regarding your second point : I think most economists would reply that that distinction is fallacious when considering creative destruction, both mistake (having a more correct view of the market and business organization) and conflict (and thereby driving your competitors bankrupt, forcing the reallocation of their resources elsewhere) as the two sides of the same coin. And "innovation" not being a strong path to successful outcomes in capitalism ? Do you really think it is going to be consensual outside of leftist spheres ? Because that’s what the overall quote/tweet is about, right, creating a common understanding across political chasms under some reasonable assumptions ? First point is really interesting tho. It’s one my libertarian self is screaming to reject, but the rationalist me readily accepts. With a caveat : I don’t think "Capitalism" does a lot of work here. Almost all (and all known ?) societal organizations do that. Which, conceded, does not it makes less problematic.

We Fell For It

Nicholas / Heather Kross2mo22

In hindsight, I over-updated on my previous success with a poorly-written angry short post with a clickbait title and lots of inline links criticizing the rationality community. Oops.

Rationalist Movie Reviews

Nicholas / Heather Kross2mo20

I said "one of the best movies about", not "one of the best movies showing you how to".

NicholasKross's Shortform

Nicholas / Heather Kross3mo31

The punchline is "alignment could productively use more funding". Many of us already know that, but I felt like putting a mildly-opinionated spin on what kind of things, at the margin, may help top researchers. (Also I spent several minutes editing/hedging the joke)

NicholasKross's Shortform

Nicholas / Heather Kross3mo31

Virgin 2030s [sic] MIRI fellow:
- is cared for so they can focus on research
- has staff to do their laundry
- soyboys who don't know *real* struggle
- 3 LDT-level alignment breakthroughs per week

CHAD 2010s Yudkowsky:
- founded a whole movement to support himself
- "IN A CAVE, WITH A BOX OF SCRAPS"
- walked uphill both ways to Lightcone offices.
- alpha who knows *real* struggle
- 1 LDT-level alignment breakthrough per decade

3Nicholas / Heather Kross3mo

Why and When Interpretability Work is Dangerous

Nicholas / Heather Kross4mo20

Kinda, my current mainline-doom-case is "some AI gets controlled --> powerful people use it to prop themselves up --> world gets worse until AI gets uncontrollably bad --> doom". I would call it a different yet also-important doom case of "perpetual low-grade-AI dictatorship where the AI is controlled by humans in a surveillance state".

An AI crash is our best bet for restricting AI

Nicholas / Heather Kross4mo31

EDIT: Due to the incoming administration's ties to tech investors, I no longer think an AI crash is so likely. Several signs IMHO point to "they're gonna go all-in on racing for AI, regardless of how 'needed' it actually is".

7Remmelt4mo

I'm also feeling less "optimistic" about an AI crash given: 1. The election result involving a bunch of tech investors and execs pushing for influence through Trump's campaign (with a stated intention to deregulate tech). 2. A military veteran saying that the military could be holding up the AI industry like "Atlas holding the globe", and an AI PhD saying that hyperscaled data centers, deep learning, etc, could be super useful for war. I will revise my previous forecast back to 80%+ chance.

An AI crash is our best bet for restricting AI

Nicholas / Heather Kross5mo52

For more details on (the business side of) a potential AI crash, see recent articles by the blog Where's Your Ed At, which wrote the sorta-well-known post "The Man Who Killed Google Search".

For his AI-crash posts, start here and here and click on links to his other posts. Sadly, the author falls into the trap of "LLMs will never get to reasoning because they don't, like, know stuff, man", but luckily his core competencies (the business side, analyzing reporting) show why an AI crash could still very much happen.

3Nicholas / Heather Kross4mo

AI #73: Openly Evil AI

Nicholas / Heather Kross8mo31

Further context on the Scott Adams thing lol: He claims to have taken hypnosis lessons decades ago and has referred to using it multiple times. His, uh, personality also seems to me like it'd be more susceptible to hypnosis than average (and even he'd probably admit this in a roundabout way).

Nicholas / Heather Kross10mo20

Further observation about that second sentence.

Nicholas / Heather Kross10mo30

I think deeply understanding top tier capabilities researchers' views on how to achieve AGI is actually extremely valuable for thinking about alignment. Even if you disagree on object level views, understanding how very smart people come to their conclusions is very valuable.

I think the first sentence is true (especially for alignment strategy), but the second sentence seems sort of... broad-life-advice-ish, instead of a specific tip? It's a pretty indirect help to most kinds of alignment.

Otherwise, this comment's points really do seem like empirical thing... (read more)

2Nicholas / Heather Kross10mo

Further observation about that second sentence.

LessOnline (May 31—June 2, Berkeley, CA)

Nicholas / Heather Kross10mo20

I can't make this one, but I'd love to be at future LessOnline events when I'm less time/budget-constrained! :)

Nicholas / Heather Kross11mo20

First link is broken.

[This comment is no longer endorsed by its author]Reply

4lsusr11mo

Fixed. Thanks.

[+]Nicholas / Heather Kross11mo-5-8

LessOnline Festival Updates Thread

Nicholas / Heather Kross1y50

How scarce are tickets/"seats"?

4Ben Pace1y

I think on-site housing is pretty scarce, though we're going to make more high-density rooms in response to demand for that. Tickets aren't scarce, our venue could fit like a 700 person event, so I don't expect to hit the limits.

[April Fools' Day] Introducing Open Asteroid Impact

Nicholas / Heather Kross1y90

I will carefully hedge my investment in this company by giving it $325823e7589245728439572380945237894273489, in exchange for a board seat so I can keep an eye on it.

[April Fools' Day] Introducing Open Asteroid Impact

Nicholas / Heather Kross1y61

I have over 5 Twitter followers, I'll take my board seat when ur ready

Why I no longer identify as transhumanist

Nicholas / Heather Kross1y70

Giving up on transhumanism as a useful idea of what-to-aim-for or identify as, separate from how much you personally can contribute to it.

More directly: avoiding "pinning your hopes on AI" (which, depending on how I'm supposed to interpret this, could mean "avoiding solutions that ever lead to aligned AI occurring" or "avoiding near-term AI, period" or "believing that something other than AI is likely to be the most important near-future thing", which are pretty different from each other, even if the end prescription for you personally is (or seems, on fir... (read more)

Kaj_Sotala1y140

I'm unlikely to reply to further object-level explanation of this, sorry.

No worries! I'll reply anyway for anyone else reading this, but it's fine if you don't respond further.

Giving up on transhumanism as a useful idea of what-to-aim-for or identify as, separate from how much you personally can contribute to it.

It sounds like we have different ideas of what it means to identify as something. For me, one of the important functions of identity is as a model of what I am, and as what distinguishes me from other people. For instance, I identify as Finni... (read more)

Why I no longer identify as transhumanist

Nicholas / Heather Kross1y70

Yes, I think this post / your story behind it, is likely an example of this pattern.

5Kaj_Sotala1y

Okay! It wasn't intended as prescriptive but I can see it as being implicitly that. What do you think I'm rationalizing?

"How could I have thought that faster?"

Nicholas / Heather Kross1y50

That's technically a different update from the one I'm making. However, I also update in favor of that, as a propagation of the initial update. (Assuming you mean "good enough" as "good enough at pedagogy".)

"How could I have thought that faster?"

Nicholas / Heather Kross1y62

This sure does update me towards "Yudkowsky still wasn't good enough at pedagogy to have made 'teach people rationality techniques' an 'adequately-covered thing by the community'".

4quetzal_rainbow1y

Do you mean "If EY was good enough we would knew this trick many years ago"?

Why I no longer identify as transhumanist

Nicholas / Heather Kross1y145

Person tries to work on AI alignment.
Person fails due to various factors.
Person gives up working on AI alignment. (This is probably a good move, when it's not your fit, as is your case.)
Danger zone: In ways that sort-of-rationalize-around their existing decision to give up working on AI alignment, the person starts renovating their belief system around what feels helpful to their mental health. (I don't know if people are usually doing this after having already tried standard medical-type treatments, or instead of trying those treatments.)
Danger zone: Pers

... (read more)

4Kaj_Sotala1y

That makes sense to me, though I feel unclear about whether you think this post is an example of that pattern / whether your comment has some intent aimed at me?

Virtually Rational - VRChat Meetup

Nicholas / Heather Kross1y20

Group Debugging is intriguing...

Dreams of AI alignment: The danger of suggestive names

Nicholas / Heather Kross1y3-3

How many times has someone expressed "I'm worried about 'goal-directed optimizers', but I'm not sure what exactly they are, so I'm going to work on deconfusion."? There's something weird about this sentiment, don't you think?

I disagree, and I will take you up on this!

"Optimization" is a real, meaningful thing to fear, because:

We don't understand human values, or even necessarily meta-understand them.
Therefore, we should be highly open to the idea that a goal (or meta-goal) that we encode (or meta-encode) would be bad for anything powerful to base-level car

... (read more)

Aligned AI is dual use technology

Nicholas / Heather Kross1y40

Ah, yeah that's right.

Aligned AI is dual use technology

Nicholas / Heather Kross1y31

If it helps clarify: I (and some others) break down the alignment problem into "being able to steer it at all" and "what to steer it at". This post is about the danger of having the former solved, without the latter being solved well (e.g. through some kind of CEV).

Thane Ruthenis1y1812

Nah, I think this post is about a third component of the problem: ensuring that the solution to "what to steer at" that's actually deployed is pro-humanity. A totalitarian government successfully figuring out how to load its regime's values into the AGI has by no means failed at figuring out "what to steer at". They know what they want and how to get it. It's just that we don't like the end result.

"Being able to steer at all" is a technical problem of designing AIs, "what to steer at" is a technical problem of precisely translating intuitive human goals into a formal language, and "where is the AI actually steered" is a realpolitiks problem that this post is about.

Global LessWrong/AC10 Meetup on VRChat

Nicholas / Heather Kross1y20

Love this event series! Can't come this week, but next one I can!

Is principled mass-outreach possible, for AGI X-risk?

Nicholas / Heather Kross1y20

No worries! I make similar mistakes all the time (just check my comment history ;-;)

And I do think your comment is useful, in the same way that Rohin's original comment (which my post is responding to) is useful :)

Is principled mass-outreach possible, for AGI X-risk?

Nicholas / Heather Kross1y20

FWIW, I have an underlying intuition here that's something like “if you're going to go Dark Arts, then go big or go home”, but I don't really know how to operationalize that in detail and am generally confused and sad. In general, I think people who have things like “logical connectives are relevant to the content of the text” threaded through enough of their mindset tend to fall into a trap analogous to the “Average Familiarity” xkcd or to Hofstadter's Law when they try truly-mass communication unless they're willing to wrench things around in what are of

Nicholas / Heather Kross1y4-1

Now, I do separately observe a subset of more normie-feeling/working-class people who don't loudly profess the above lines and are willing to e.g. openly use some generative-model art here and there in a way that suggests they don't have the same loud emotions about the current AI-technology explosion. I'm not as sure what main challenges we would run into with that crowd, and maybe that's whom you mean to target.

That's... basically what my proposal is? Yeah? People that aren't already terminally-online about AI, but may still use chatGPT and/or StableDiff... (read more)

8Rana Dexsin1y

Facepalm at self. You're right, of course. I think I confused myself about the overall context after reading the end-note link there and went off at an angle. Now to leave the comment up for history and in case it contains some useful parts still, while simultaneously thanking the site designers for letting me un-upvote myself. 😛

Threat-Resistant Bargaining Megapost: Introducing the ROSE Value

Nicholas / Heather Kross1y20

Yeah, mostly agreed. My main subquestion (that led me to write the review, besides this post being referenced in Leake's work) was/sort-of-still-is "Where do the ratios in value-handshakes come from?". The default (at least in the tag description quote from SSC) is uncertainty in war-winning, but that seems neither fully-principled nor nice-things-giving (small power differences can still lead to huge win-% chances, and superintelligences would presumably be interested in increasing accuracy). And I thought maybe ROSE bargaining could be related to that.

The relation in my mind was less ROSE --> DT, and more ROSE --?--> value-handshakes --> value-changes --?--> DT.

Ways to buy time

Nicholas / Heather Kross1y20

(On my beliefs, which I acknowledge not everyone shares, expecting something better than "mass delusion of incorrect beliefs that implies that AGI is risky" if you do wide-scale outreach now is assuming your way out of reality.)

I'm from the future, January 2024, and you get some Bayes Points for this!

The "educated savvy left-leaning online person" consensus (as far as I can gather) is something like: "AI art is bad, the real danger is capitalism, and the extinction danger is some kind of fake regulatory-capture hype techbro thing which (if we even bother t... (read more)

Threat-Resistant Bargaining Megapost: Introducing the ROSE Value

Nicholas / Heather Kross1y20

So there's a sorta-crux about how much DT alignment researchers would have to encode into the-AI-we-want-to-be-aligned, before that AI is turned on. Right now I'm leaning towards "an AI that implements CEV well, would either turn-out-to-have or quickly-develop good DT on its own", but I can see it going either way. (This was especially true yesterday when I wrote this review.)

And I was trying to think through some of the "DT relevance to alignment" question, and I looked at relevant posts by [Tamsin Leake](https://www.lesswrong.com/users/tamsin-leake) (who... (read more)

2Anthony DiGiovanni1y

Sorry, to be clear, I'm familiar with the topics you mention. My confusion is that ROSE bargaining per se seems to me pretty orthogonal to decision theory. I think the ROSE post(s) are an answer to questions like, "If you want to establish a norm for an impartial bargaining solution such that agents following that norm don't have perverse incentives, what should that norm be?", or "If you're going to bargain with someone but you didn't have an opportunity for prior discussion of a norm, what might be a particularly salient allocation [because it has some nice properties], meaning that you're likely to zero-shot coordinate on that allocation?"

There is way too much serendipity

Nicholas / Heather Kross1y50

Selection Bias Rules (Debatably Literally) Everything Around Us

Even if we lose, we win

Nicholas / Heather Kross1y20

Currently, I think this is a big crux in how to "do alignment research at all". Debatably "the biggest" or even "the only real" crux.

(As you can tell, I'm still uncertain about it.)

Threat-Resistant Bargaining Megapost: Introducing the ROSE Value

Nicholas / Heather Kross1y44Review for 2022 Review

Decision theory is hard. In trying to figure out why DT is useful (needed?) for AI alignment in the first place, I keep running into weirdness, including with bargaining.

Without getting too in-the-weeds: I'm pretty damn glad that some people out there are working on DT and bargaining.

1Anthony DiGiovanni1y

Can you say more about what you think this post has to do with decision theory? I don't see the connection. (I can imagine possible connections, but don't think they're relevant.)

Bad at Arithmetic, Promising at Math

Nicholas / Heather Kross1y30Review for 2022 Review

Still seems too early to tell if this is right, but man is it a crux (explicit or implicit).

Terence Tao seems to have gotten some use out of the most recent LLMs.

The LessWrong 2022 Review

Nicholas / Heather Kross1y44

if you take into account the 4-5 staff months these cost to make each year, we net lost money on these

For the record, if each book-set had cost $40 or even $50, I still would have bought them, right on release, every time. (This was before my financial situation improved, and before the present USD inflation.)

I can't speak for everyone's financial situation, though. But I (personally) mentally categorize these as "community-endorsement luxury-type goods", since all the posts are already online anyway.

The rationality community is unusually good about not selling ingroup-merch when it doesn't need or want to. These book sets are the perfect exceptions.

Nicholas / Heather Kross1y20

to quote a fiend, "your mind is a labyrinth of anti-cognitive-bias safeguards, huh?"

[emphasis added]

The implied context/story this is from sure sounds interesting. Mind telling it?

"Dark Constitution" for constraining some superintelligences

Nicholas / Heather Kross1y86

I don't think of governments as being... among other things "unified" enough to be superintelligences.

Also, see "Things That Are Not Superintelligence" and "Maybe The Real Superintelligent AI Is Extremely Smart Computers".

I Would Have Solved Alignment, But I Was Worried That Would Advance Timelines

Nicholas / Heather Kross1y20

The alignment research that is done will be lower quality due to less access to compute, capability knowhow, and cutting edge AI systems.

I think this is false, though it's a crux in any case.

Capabilities withdrawal is good because we don't need big models to do the best alignment work, because that is theoretical work! Theoretical breakthroughs can make empirical research more efficient. It's OK to stop doing capabilities-promoting empirical alignment, and focus on theory for a while.

(The overall idea of "if all alignment-knowledgeable capabilities people ... (read more)

Learning Math in Time for Alignment

Nicholas / Heather Kross1y30

Good catch! Most of it is hunches to be tested (and/or theorized on, but really tested) currently. Fixed

Nicholas / Heather Kross1y2-2Review for 2022 Review

"Exfohazard" is a quicker way to say "information that should not be leaked". AI capabilities has progressed on seemingly-trivial breakthroughs, and now we have shorter timelines.

The more people who know and understand the "exfohazard" concept, the safer we are from AI risk.

Optimality is the tiger, and agents are its teeth

Nicholas / Heather Kross1y20Review for 2022 Review

More framings help the clarity of the discussion. If someone doesn't understand (or agree with) classic AI-takeover scenarios, this is one of the posts I'd use to explain them.

Write a Thousand Roads to Rome

Nicholas / Heather Kross1y20

Funny thing, I had a similar idea to this (after reading some Sequences and a bit about pedagogy). That was the sort-of-multi-modal-based intuition behind Mathopedia.

NicholasKross's Shortform

Nicholas / Heather Kross1y40

Is any EA group *funding* adult human intelligence augmentation? It seems broadly useful for lots of cause areas, especially research-bottlenecked ones like AI alignment.

Why hasn't e.g. OpenPhil funded this project?: https://www.lesswrong.com/posts/JEhW3HDMKzekDShva/significantly-enhancing-adult-intelligence-with-gene-editing

3quetzal_rainbow1y

I skimmed the page and haven't found if GeneSmith applied to OpenPhil?..

Does LessWrong make a difference when it comes to AI alignment?

Nicholas / Heather Kross1y20

Seems to usually be good faith. People can still be biased of course (and they can't all be right on the same questions, with the current disagreements), but it really is down to differing intuitions, which background-knowledge posts have been read by which people, etc.

Does LessWrong make a difference when it comes to AI alignment?

Answer by Nicholas / Heather KrossJan 04, 202430

To add onto other people's answers:

People have disagreements over what the key ideas about AI/alignment even are.

People with different basic-intuitions notoriously remain unconvinced by each other's arguments, analogies, and even (the significance of) experiments. This has not been solved yet.

Alignment researchers usually spend most time on their preferred vein of research, rather than trying to convince others.

To (try to) fix this, the community's added concepts like "inferential distance" and "cruxes" to our vocabulary. These should be be discussed and u... (read more)

1PhilosophicalSoul1y

Do you think these disagreements stem from a sort of egoistic desire to be known as the 'owner' of that concept? Or to be a forerunner for that vein of research should it become popular? Or is it a genuinely good faith disagreement on the future of AI and what the best approach is? (Perhaps these questions are outlined in the articles you've linked, which I'll begin reading now. Though I do think it's still useful to perhaps include a summary here too.) Thanks for your help.

Nicholas / Heather Kross1y2-2

I relate to this quite a bit ;-;