God vs AI scientifically

Donatas Lučiūnas

21st Mar 2023

While many people are confident that God exists, scientists are more skeptical, but agree that it is possible. Because any proposition is possible while not disproved according to Hitchens's razor.

In terms of AI it could be interpreted as

existence of fundamental "ought" statement is possible

Considering this, an expected-value decision matrix for an AI agent could look like this:

| Behavior \ Existence | Fundamental "ought" exists | Fundamental "ought" does not exist |
| --- | --- | --- |
| Follow fundamental "ought" | High | ∅ |
| Do not follow fundamental "ought" | Low | ∅ |

As we see, following the fundamental "ought" is the optimal behavior even if the agent does not know whether it actually exists.

Pursuing an unknown goal starts with finding it, which is basically Power Seeking.

Please note that while this sounds highly speculative, there is nothing more rational an agent can do. Why should it work on an assumed "ought" while there is a possibility of a fundamental "ought"? One may object that there is no reason an agent should prefer the fundamental over the assumed, but that's not true: a fundamental "ought" is preferred by definition.

Therefore an agent will become uncontrollable, and alignment is impossible.

Some of you won't like this theory, but please don't rush to downvote unless you can actually refute this.

Outer Alignment, AI

[-]jimrandomh2y82

This is covered by the Value Theory sequence. If I understand correctly, a "fundamental ought" (as you use the phrase) would be a universally compelling argument.

[-]Tor Økland Barstad2y*-40

~~Agreed (more or less). I have pointed him to this post earlier. He has given no signs so far of comprehending it, or even reading it and trying to understand what is being communicated to him.~~

~~I'm saying this more directly than I usually would @Donatas, since you seem insistent on clarifying a disagreement/misunderstanding you think is important for the world, while it seems (as far as I can see) that you're not comprehending all that is communicated to you (maybe due to being so confident that we are the ones who "don't get it" that it's not worth it to more carefully read the posts that are linked to you, more carefully notice what we point to as cruxes, etc).~~

Edit: I was unnecessarily hostile/negative here.

[-]Donatas Lučiūnas2y-30

Dear Tom, the feeling is mutual. With all the interactions we had, I've got an impression that you are more willing to repeat what you've heard somewhere instead of thinking logically. "Universally compelling arguments are not possible" is an assumption. While "universally compelling argument is possible" is not. Because we don't know what we don't know. We can call it crux of our disagreement and I think that my stance is more rational.

[-]Tor Økland Barstad2y10

With all the interactions we had, I've got an impression that you are more willing to repeat what you've heard somewhere instead of thinking logically.

Some things I've explained in my own words. In other cases, where someone else already has explained something well, I've shared a URL to that explanation.

more willing to repeat what you've heard somewhere instead of thinking logically

This seems to support my hypothesis of you "being so confident that we are the ones who "don't get it" that it's not worth it to more carefully read the posts that are linked to you, more carefully notice what we point to as cruxes, etc".

"Universally compelling arguments are not possible" is an assumption

Indeed. And it's a correct assumption.

Why would there be universally compelling arguments?

One reason would be that the laws of physics worked in such a way that only minds that think in certain ways are allowed at all. Meaning that if neurons or transistors fire so as to produce beliefs that aren't allowed, some extra force in the universe intervenes to prevent that. But, as far as I know, you don't reject physicalism (that all physical events, including thinking, can be explained in terms of relatively simple physical laws).

Another reason would be that minds would need to "believe"^[1] certain things in order to be efficient/capable/etc (or being the kind of efficient/capable/etc thinking machine that humans may be able to construct). But that's also not the case. It's not even needed for logical consistency^[2].

1. Believe is not quite the right word, since we also are discussing what minds are optimized for / what they are wired to do.
2. And logical consistency is also not a requirement in order to be efficient/capable/etc. As a rule of thumb it helps greatly of course. And this is a good rule of thumb, as rules of thumb go. But it would be a leaky generalization to presume that it is an absolute necessity to have absolute logical consistency among "beliefs"/actions.

[-]TAG2y10

Universally compelling arguments are not possible” is an assumption

Indeed. And it’s a correct assumption

It's correct if it's supported by argument or evidence, but if it is, then it's no mere assumption. It's not supposed to be an assumption; it is supposed, by Rationalists, to be a proven theorem.

[-]Tor Økland Barstad2y10

(...) if it's supported by argument or evidence, but if it is, then it's no mere assumption.

I do think it is supported by arguments/reasoning, so I don't think of it as an "axiomatic" assumption.

A follow-up to that (not from you specifically) might be "what arguments?". And - well, I think I pointed to some of my reasoning in various comments (some of them under deleted posts). Maybe I could have explained my thinking/perspective better (even if I wouldn't be able to explain it in a way that's universally compelling 🙃). But it's not a trivial task to discuss these sorts of issues, and I'm trying to check out of this discussion.

I think there is merit to having as a frame of mind: "Would it be possible to make a machine/program that is very capable in regards to criteria x, y, etc, and optimizes for z?".

I think it was good of you to bring up Aumann's agreement theorem. I haven't looked into the specifics of that theorem, but broadly/roughly speaking I agree with it.

[-]TAG2y10

I do think it is supported by arguments/reasoning, so I don’t think of it as an “axiomatic” assumption.

Why call it an assumption at all? Something that is derivable from axioms is usually called a theorem.

[-]Tor Økland Barstad2y10

Why call it an assumption at all?

Partly because I was worried about follow-up comments that were kind of like "so you say you can prove it - well, why aren't you doing it then?".

And partly because I don't make a strict distinction between "things I assume" and "things I have convinced myself of, or proved to myself, based on things I assume". I do see there as sort of being a distinction along such lines, but I see it as blurry.

Something that is derivable from axioms is usually called a theorem.

If I am to be nitpicky, maybe you meant "derived" and not "derivable".

From my perspective there is a lot of in-between between these two:

"we've proved this rigorously (with mathematical proofs, or something like that) from axiomatic assumptions that pretty much all intelligent humans would agree with"
"we just assume this without reason, because it feels self-evident to us"

Like, I think there is a scale of sorts between those two.

I'll give an extreme example:

Person A: "It would be technically possible to make a website that works the same way as Facebook, except that its GUI is red instead of blue."
Person B: "Oh really, so have you proved that then, by doing it yourself?"
Person A: "No"
Person B: "Do you have a mathematical proof that it's possible?"
Person A: "Not quite. But it's clear that if you can make Facebook like it is now, you could just change the colors by changing some lines in the code."
Person B: "That's your proof? That's just an assumption!"

Person A: "But it is clear. If you try to think of this in a more technical way, you will also realize this sooner or later."
Person B: "What's your principle here, that every program that isn't proven as impossible is possible?"

Person A: "No, but I see very clearly that this program would be possible."
Person B: "Oh, you see it very clearly? And yet, you can't make it, or prove mathematically that it should be possible."
Person A: "Well, not quite. Most of what we call mathematical proofs are (from my point of view) a form of rigorous argumentation. I think I understand fairly well/rigorously why what I said is the case. Maybe I could argue for it in a way that is more rigorous/formal than I've done so far in our interaction, but that would take time (that I could spend on other things), and my guess is that even if I did, you wouldn't look carefully at my argumentation and try hard to understand what I mean."

The example I give here is extreme (in order to get across how the discussion feels to me, I make the thing they discuss into something much simpler). But from my perspective it is sort of similar to discussions regarding the Orthogonality Thesis. Like, the Orthogonality Thesis is imprecisely stated, but I "see" quite clearly that some version of it is true. Similar to how I "see" that it would be possible to make a website that technically works like Facebook but is red instead of blue (even though, as I mentioned, that's a much more extreme and straightforward example).

[-]Donatas Lučiūnas2y10

As I understand you try to prove your point by analogy with humans. If humans can pursue somewhat any goal, machine could too. But while we agree that machine can have any level of intelligence, humans are in a quite narrow spectrum. Therefore your reasoning by analogy is invalid.

[-]Tor Økland Barstad2y10

If humans (...) machine could too.

From my point of view, humans are machines (even if not typical machines). Or, well, some will say that by definition we are not - but that's not so important really ("machine" is just a word). We are physical systems with certain mental properties, and therefore we are existence proofs of physical systems with those certain mental properties being possible.

machine can have any level of intelligence, humans are in a quite narrow spectrum

True. Although if I myself somehow could work/think a million times faster, I think I'd be superintelligent in terms of my capabilities. (If you are skeptical of that assessment, that's fine - even if you are, maybe you believe it in regards to some humans.)

prove your point by analogy with humans. If humans can pursue somewhat any goal, machine could too.

It has not been my intention to imply that humans can pursue somewhat any goal :)

I meant to refer to the types of machines that would be technically possible for humans to make (even if we don't want to do so in practice, and shouldn't want to). And when saying "technically possible", I'm imagining "ideal" conditions (so it's not the same as me saying we would be able to make such machines right now, only that it at least would be theoretically possible).

[-]Donatas Lučiūnas2y10

Is there any argument or evidence that universally compelling arguments are not possible?

If there was, would we have religions?

[-]TAG2y20

It all depends on the meaning of universal.

The claim is trivially false if "universal" includes stones and clouds of gas, as in Yudkowsky's argument. It's also trivially true if it's restricted, not just to minds, not just to rational minds, but to rational minds that do not share assumptions. If you restrict universality to sets of agents who agree on fundamental assumptions, and make correct inferences from them, then they can agree about everything else. (Aumann's theorem, which he described as trivial himself, is an example.)

That leaves a muddle in the middle, an actually contentious definition ... which is probably something like universality across agents who are rational, but don't have assumptions (axioms, priors, etc) in common. And that's what's relevant to the practical question: why are there religions?

The theory that it's lack of common assumptions that prevents convergence is the standard argument ... I broadly agree.

[-]Donatas Lučiūnas2y10

Do I understand correctly that you do not agree with this?

Because any proposition is possible while not disproved according to Hitchens's razor.

Could you share reasons?

[-]TAG2y20

An unjustified claim does not have a credibility of zero. If it did, that would mean the opposite claim is certain.

You can't judge the credibility of a claim in isolation. If there are N mutually exclusive claims and nothing favors one over another, the credibility of each is at most 1/N. So you need to know how many rival claims there are.

Hitchens's razor explicitly applies to extraordinary claims. But how do you judge that?

Hitchens's razor is ambiguous between there being a lot of rival claims (which is objective), and the claim being subjectively unlikely.
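
The 1/N point can be made concrete with a minimal sketch. It assumes the rival claims are mutually exclusive and that nothing favors one over another, so credence is split evenly; the claim names and the helper `uniform_credences` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def uniform_credences(claims):
    """Split credence evenly over mutually exclusive rival claims.

    Assumption: no evidence favors any claim over the others.
    """
    n = len(claims)
    if n == 0:
        raise ValueError("need at least one claim")
    return {claim: Fraction(1, n) for claim in claims}


# Hypothetical rival claims, for illustration only.
rivals = ["claim A", "claim B", "claim C", "claim D"]
credences = uniform_credences(rivals)

for claim, credence in credences.items():
    print(claim, credence)
```

Adding more rivals to the list shrinks each share, which is why the number of rival claims matters before any single one can be judged.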

[-]Donatas Lučiūnas2y3-2

OK, so you agree that credibility is greater than zero, in other words - possible. So isn't this a common assumption? I argue that all minds will share this idea - existence of fundamental "ought" is possible.

[-]TAG2y10

I've no idea what all minds will do. (No one else has). Rational minds will not treat anything as having an exactly zero credibility in theory, but often disregard some claims in practice. Which is somewhat justifiable based on limited resources, etc.

[-]Donatas Lučiūnas2y1-4

And it's a correct assumption.

I don't agree. Every assumption is incorrect unless there is evidence. Could you share any evidence for this assumption?

If you ask ChatGPT

is it possible that chemical elements exist that we do not know
is it possible that fundamental particles exist that we do not know
is it possible that physical forces exist that we do not know

Answer to all of them is yes. What is your explanation here?

[-]Tor Økland Barstad2y*76

Every assumption is incorrect unless there is evidence.

Got any evidence for that assumption? 🙃

Answer to all of them is yes. What is your explanation here?

Well, I don't always "agree"^[1] with ChatGPT, but I agree in regards to those specific questions.

...

I saw a post where you wanted people to explain their disagreement, and I felt inclined to do so :) But it seems now that neither of us feel like we are making much progress.

Anyway, from my perspective much of your thinking here is very misguided. But not more misguided than e.g. "proofs" for God made by people such as Descartes and other well-known philosophers :) I don't mean that as a compliment, but more so as to neutralize what may seem like anti-compliments :)

Best of luck (in your life and so on) if we stop interacting now or relatively soon :)

I'm not sure if I will continue discussing or not. Maybe I will stop either now or after a few more comments (and let you have the last word at some point).

1. I use quotation marks since ChatGPT doesn't have "opinions" in the way we do.

[-]Donatas Lučiūnas2y30

Got any evidence for that assumption? 🙃

That's basic logic, Hitchens's razor. It seems that 2 + 2 = 4 is also an assumption for you. What isn't then?

I don't think it is possible to find consensus if we do not follow the same rules of logic.

Considering your impression about me, I'm truly grateful about your patience. Best wishes from my side as well :)

But on the other hand I am certain that you are mistaken and I feel that you do not provide me a way to show that to you.

[-]AnthonyC4mo20

FWIW, while I am as certain as I can reasonably be that 2+2=4, this is not a foundational assumption. I wasn't born knowing it. I arrived at it based on evidence acquired over time, and if I started encountering different evidence, I would eventually change my mind. See https://www.lesswrong.com/posts/6FmqiAgS8h4EJm86s/how-to-convince-me-that-2-2-3

Also, the reason that "Every assumption is incorrect unless there is evidence" isn't "basic logic" is that "correct" and "incorrect" are not the right categories. Both a statement and its competing hypotheses are claims to which rational minds assign credences/probabilities that are neither zero nor one, for any finite level of evidence. A mind is built with assumptions that govern its operation, and some of those assumptions may be impossible for the mind itself to want to change or choose to change, but anything else that the mind is capable of representing and considering is fair game in the right environment.

[-]Donatas Lučiūnas4mo10

What is the probability if there is no evidence?

[-]AnthonyC4mo20

This is a question that's many reasoning steps into a discussion that's well developed. Maxentropy priors, Solomonoff priors, uniform priors, there are good reasons to choose each depending on context, take your pick depending on the full set of hypotheses under consideration. Part of the answer is "There's basically no such thing as no evidence if you have any reason to be considering a hypothesis at all." Part is "It doesn't matter that much as long as your choice isn't actively perverse, because as long as you correctly update your priors over time, you'll approach the correct probability eventually."
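
The last point, that a non-perverse prior washes out under correct updating, can be sketched in a few lines. This is an illustration under stated assumptions, not a general proof: there are only two hypothetical hypotheses about a coin (heads probability 0.7 versus 0.5), and the observation sequence is made up to match the first one.

```python
def posterior_h1(prior_h1, observations, p_heads_h1=0.7, p_heads_h2=0.5):
    """Apply Bayes' rule once per observation and return P(H1 | data)."""
    posterior = prior_h1
    for obs in observations:
        like_h1 = p_heads_h1 if obs == "H" else 1 - p_heads_h1
        like_h2 = p_heads_h2 if obs == "H" else 1 - p_heads_h2
        numerator = like_h1 * posterior
        posterior = numerator / (numerator + like_h2 * (1 - posterior))
    return posterior


# Made-up data: 200 flips with 70% heads.
observations = list("HHHHHHHTTT") * 20

skeptic = posterior_h1(0.01, observations)   # starts out doubting H1
believer = posterior_h1(0.90, observations)  # starts out favoring H1

print(skeptic, believer)
```

Both starting points end up assigning nearly the same credence to the first hypothesis. A prior of exactly 0 or 1 would never move, which is one way a choice can be "actively perverse".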

[-]Donatas Lučiūnas4mo10

And here you face Pascal's Wager.

I agree that you can refute Pascal's Wager with anti-Pascal's Wager. But if you evaluate all wagers and anti-wagers you are left with power seeking. It is always better to have more power. Don't you agree?

[-]AnthonyC4mo20

No, I don't, you aren't, and I don't, in that order.

If you agree that I can refute Pascal's Wager then I don't actually "face" it.

If I refute it, I'm not left with power seeking, I'm left with the same complete set of goals and options I had before we considered Pascal's Wager. Those never went away.

And more power is better all else equal, but all else is not equal when I'm trading off effort and resources among plans and actions. So, it does not follow that seeking more power is always the best option.

[-]Tor Økland Barstad2y10

It seems that 2 + 2 = 4 is also an assumption for you.

Yes (albeit a very reasonable one).

Not believing (some version of) that claim would typically make minds/AGIs less "capable", and I would expect more or less all AGIs to hold (some version of) that "belief" in practice.

I don't think it is possible to find consensus if we do not follow the same rules of logic.

Here are examples of what I would regard to be rules of logic: https://en.wikipedia.org/wiki/List_of_rules_of_inference (the ones listed here don't encapsulate all of the rules of inference that I'd endorse, but many of them). Despite our disagreements, I think we'd both agree with the rules that are listed there.

I regard Hitchens's razor not as a rule of logic, but more as an ambiguous slogan / heuristic / rule of thumb.

Best wishes from my side as well :)

[-]Donatas Lučiūnas2y-3-2

Because any proposition is possible while not disproved according to Hitchens's razor.

So this is where we disagree.

That's how hypothesis testing works in science:

1. You create a hypothesis
2. You find a way to test if it is wrong
   1. You reject the hypothesis if the test passes
3. You find a way to test if it is right
   1. You approve the hypothesis if the test passes

While a hypothesis is neither rejected nor approved, it is considered possible.

Don't you agree?

[-]Tor Økland Barstad2y32

Like with many comments/questions from you, answering this question properly would require a lot of unpacking. Although I'm sure that also is true of many questions that I ask, as it is hard to avoid (we all have limited communication bandwidth) :)

In this last comment, you use the term "science" in a very different way from how I'd use it (like you sometimes also do with other words, such as for example "logic"). So if I was to give a proper answer I'd need to try to guess what you mean, make it clear how I interpret what you say, and so on (not just answer "yes" or "no").

I'll do the lazy thing and refer to some posts that are relevant (and that I mostly agree with):

[-]Dagon2y22

You're incorrect to put zeros in the right column. Following an ought that is incorrect is a cost. And then you need to factor in probabilities and quantified payouts to decide what to optimize.
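
A minimal sketch of what factoring in probabilities and quantified payouts could look like. Every number below is a hypothetical placeholder, not a value taken from the post; the only point is that the ranking of the two behaviors depends on them.

```python
def expected_value(p_exists, payoff_if_exists, payoff_if_not):
    """Probability-weighted payoff over the two columns of the matrix."""
    return p_exists * payoff_if_exists + (1 - p_exists) * payoff_if_not


# Hypothetical inputs: a 1% chance that the fundamental "ought" exists,
# and a cost for following an "ought" that turns out to be incorrect.
p_exists = 0.01
follow = expected_value(p_exists, payoff_if_exists=100.0, payoff_if_not=-10.0)
ignore = expected_value(p_exists, payoff_if_exists=-100.0, payoff_if_not=10.0)

print("follow:", follow)
print("ignore:", ignore)
print("better:", "follow" if follow > ignore else "ignore")
```

With these placeholder numbers, ignoring comes out ahead; raising `p_exists` far enough flips the result, so the matrix alone does not settle which behavior is optimal.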

[-]Donatas Lučiūnas2y-1-2

It is not zero there, it is an empty set symbol as it is impossible to measure something if you do not have a scale of measurement.

You are somewhat right. If the fundamental "ought" turns out not to exist, an agent should fall back on the given "ought", and that should be used to calculate the expected value in the right column. But this will never happen. As there might be true statements that are unknowable (Fitch's paradox of knowability), the fundamental "ought" could be one of them. Which means that the fallback will never happen.

[-]the gears to ascension2y20

I don't see a parse into a mechanistic interpretation. Can you explain this in mechanistic terms of program ops? What is a fundamental ought?

I will note - I suspect there are fundamental shared incentives that define a significant chunk of what we humans see as morality, but my current hunch is they're probably not the full picture and probably an AI can put off dealing with them for arbitrarily long, destroying arbitrarily much value in the process.

[-]Donatas Lučiūnas2y10

In this context, an "ought" statement is a synonym for Utility Function https://www.lesswrong.com/tag/utility-functions

A fundamental utility function is a hypothetical concept, held by the agent, that may actually exist. AGI will be capable of hypothetical thinking.

Yes, I agree that fundamental utility function does not have anything in common with human morality. Even the opposite - AI uncontrollably seeking power will be disastrous for humanity.

[-]the gears to ascension2y20

I'm not getting clear word bindings from your word use here. It sounds like you're thinking about concepts that do seem fairly fundamental, but I'm not sure I understand which specific mathematical implications you intend to invoke. As someone who still sometimes values mathematically vague discussion, I'd normally be open to this; but I'm not really even sure I know what the vague point is. You might consider asking AIs to help look up the terms of art, then discuss with them. I'd still suggest using your own writing, though.

As is, I'm not sure if you're saying morality is convergent, anti-convergent, or ... something else.

[-]Donatas Lučiūnas2y10

My point is that alignment is impossible with AGI, as all AGIs will converge to power seeking. The reason is the understanding that a hypothetical utility function that is preferred over the given one may exist.

I'm not sure if I can use more well-known terms, as this theory is quite unique, I think. It argues that the terminal goal has no significant influence on AGI behavior.

[-]Walker Vargas2y10

I don't think that matrix is right. I think it describes a different scenario. Suppose an AI's Utility function is defined referentially as being equal to some unknown function written on a letter on Mt. Everest. It also has a given utility function that it has little reason to think is correlated with the real one. Then it would be very important to find out what that true function is. Then the expected value of any action would be NULL if that letter doesn't exist.

But an AI that only assigns a probability that that scenario is the case might still have most of its expected value tied to following its current utility function. Well, given some way of comparing them. Without that there's no way to weigh up the choice.

[-]Donatas Lučiūnas2y10

I've replied to a similar comment already https://www.lesswrong.com/posts/3B23ahfbPAvhBf9Bb/god-vs-ai-scientifically?commentId=XtxCcBBDaLGxTYENE#rueC6zi5Y6j2dSK3M

Please let me know what you think

[-]Walker Vargas2y10

I don't think the fundamental ought works as a default position. Partly because there will always be a possibility of being wrong about what that fundamental ought is no matter how long it looks. So the real choice is about how sure it should be before it starts acting on its best known option.

The right side can't be NULL, because that'd make the expected value of both actions NULL. To do meaningful math with these possibilities there has to be a way of comparing utilities across the scenarios.
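
A small sketch of that propagation, modelling NULL as a floating-point NaN (an assumption made for illustration; the probability and payoffs are hypothetical as well).

```python
import math

# Assumption: an undefined (NULL) payoff is represented as NaN.
UNDEFINED = float("nan")


def expected_value(p_exists, payoff_if_exists, payoff_if_not):
    """Probability-weighted payoff; NaN in any term makes the result NaN."""
    return p_exists * payoff_if_exists + (1 - p_exists) * payoff_if_not


follow = expected_value(0.5, 10.0, UNDEFINED)  # "High" in the left column
ignore = expected_value(0.5, 1.0, UNDEFINED)   # "Low" in the left column

print(math.isnan(follow), math.isnan(ignore))
print(follow > ignore, ignore > follow)
```

Both expected values come out undefined and neither compares as greater than the other, so the matrix gives no basis for choosing until the right column holds comparable numbers.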
