You Are Underestimating the Likelihood That Convergent Instrumental Subgoals Lead to Aligned AGI
This post is an entry for the Future Fund's "AI Worldview" prize. Namely, I claim that the estimates given for the following probability are too high:

> P(misalignment x-risk|AGI): Conditional on AGI being developed by 2070, humanity will go extinct or drastically curtail its future potential due to loss of...
Fine, replace the agents with rocks. The problem still holds.
There's no closed-form solution to the three-body problem; you can only numerically approximate the system's future, with accuracy that decreases as time goes on. There are far more than three bodies in the universe relevant to the long-term survival of an AGI, and an AGI could die in any number of ways, because it is made of many complex pieces, any of which can break or fail.
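To make the point about numerical approximation concrete, here is a minimal sketch (the initial conditions, masses, and units are made up purely for illustration): two simulations of a planar three-body system that differ by one part in a billion in a single coordinate drift apart as the integration runs, so any forecast of the system's far future degrades no matter how carefully you compute.

```python
import numpy as np

G = 1.0  # gravitational constant in arbitrary units (assumption for illustration)

def accelerations(pos, mass):
    """Pairwise Newtonian gravitational accelerations for n bodies in 2D."""
    acc = np.zeros_like(pos)
    for i in range(len(mass)):
        for j in range(len(mass)):
            if i != j:
                r = pos[j] - pos[i]
                acc[i] += G * mass[j] * r / np.linalg.norm(r) ** 3
    return acc

def simulate(pos, vel, mass, dt, steps):
    """Leapfrog (velocity Verlet) integration; returns the position trajectory."""
    pos, vel = pos.copy(), vel.copy()
    traj = [pos.copy()]
    acc = accelerations(pos, mass)
    for _ in range(steps):
        vel += 0.5 * dt * acc
        pos += dt * vel
        acc = accelerations(pos, mass)
        vel += 0.5 * dt * acc
        traj.append(pos.copy())
    return np.array(traj)

# Three equal-mass bodies in a rough triangle (made-up initial conditions).
mass = np.array([1.0, 1.0, 1.0])
pos0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.8]])
vel0 = np.array([[0.0, -0.3], [0.0, 0.3], [0.3, 0.0]])

dt, steps = 0.001, 20000
base = simulate(pos0, vel0, mass, dt, steps)

# Perturb one coordinate by one part in a billion and re-run.
pos0_p = pos0.copy()
pos0_p[0, 0] += 1e-9
perturbed = simulate(pos0_p, vel0, mass, dt, steps)

# The gap between the two trajectories grows as the horizon lengthens.
for t in range(0, steps + 1, steps // 5):
    gap = np.linalg.norm(base[t] - perturbed[t])
    print(f"t = {t * dt:5.1f}  divergence = {gap:.2e}")
```

The divergence printed at later times is many orders of magnitude larger than the initial 1e-9 perturbation, which is the sense in which long-horizon prediction gets harder, not easier, with time.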