player_03 - LessWrong

Foom seems unlikely in the current LLM training paradigm

~~Gary Marcus~~ Yann LeCun describes LLMs as "an off-ramp on the road to AGI," and I'm inclined to agree. LLMs themselves aren't likely to "turn AGI." Each generation of LLMs demonstrates the same fundamental flaws, even as they get better at hiding them.

But I also completely buy the "FOOM even without superintelligence" angle, as well as the argument that they'll speed up AI research by an unpredictable amount.

The Orthogonality Thesis is Not Obviously True

player_032y30

I agree that it can take a long time to prove simple things. But my claim is that one has to be very stupid to think 1+1=3

Or one might be working from different axioms. I don't know what axioms, and I'd look at you funny until you explained, but I can't rule it out. It's possible (though implausible given its length) that Principia Mathematica wasn't thorough enough, that it snuck in a hidden axiom that - if challenged - would reveal an equally-coherent alternate counting system in which 1+1=3.

I brought up Euclid's postulates as an example of a time this actually happened. It seems obvious that "two lines that are parallel to the same line are also parallel to each other," but in fact it only holds in Euclidean geometry. To quote the Wikipedia article on the topic,

Many other statements equivalent to the parallel postulate have been suggested, some of them appearing at first to be unrelated to parallelism, and some seeming so self-evident that they were unconsciously assumed by people who claimed to have proven the parallel postulate from Euclid's other postulates.

"So self-evident that they were unconsciously assumed." But it turned out, you can't prove the parallel postulate (or any equivalent postulate) from first principles, and there were a number of equally-coherent geometries waiting to be discovered once we started questioning it.

My advice is to be equally skeptical of claims of absolute morality. I agree you can derive human morality if you assume that sentience is good, happiness is good, and so on. And maybe you can derive those from each other, or from some other axioms, but at some point your moral system does have axioms. An intelligent being that didn't start from these axioms could likely derive a coherent moral system that went against most of what humans consider good.

Summary: you're speculating, based on your experience as an intelligent human, that an intelligent non-human would deduce a human-like moral system. I'm speculating that it might not. The problem is, neither of us can exactly test this at the moment. The only human-level intelligences we could ask are also human, meaning they have human values and biases baked in.

We all accept similar axioms, but does that really mean those axioms are the only option?

The Orthogonality Thesis is Not Obviously True

player_032y21

In other words, when I say that "Murder is bad," that is a fact about the world, as true as 2+2=4 or the Pythagorean theorem.

I like this way of putting it.

In Principia Mathematica, Whitehead and Russell spent over 300 pages laying groundwork before they even attempt to prove 1+1=2. Among other things, they needed to define numbers (especially the numbers 1 and 2), equality, and addition.

I do think "1+1=2" is an obvious fact. If someone claimed to be intelligent and also said that 1+1=3, I'd look at them funny and press for clarification. Given all the assumptions about how numbers work I've absorbed over the course of my life, I'd find it hard to conceive of anything else.

Likewise, I find it hard to conceive of any alternative to "murder is bad," because over the course of my life I've absorbed a lot of assumptions about the value of sentient life. But the fact that I've absorbed these assumptions doesn't mean every intelligent entity would agree with them.

In this analogy, the assumptions underpinning human morality are like Euclid's postulates. They seem so obvious that you might just take them for granted, as the only possible self-consistent system. But we could have missed something, and one of them might not be the only option, and there might be other self-consistent geometries/moralities out there. (The difference being that in the former case M.C. Escher uses it to make cool art, and in the latter case an alien or AI does something we consider evil.)

What was my mistake evaluating risk in this situation?

player_034y10

Back when you were underestimating Covid, how much did you hear from epidemiologists? Either directly or filtered through media coverage?

I was going to give an answer about how "taking the outside view" should work, but I realized I needed this information first.

ELI12: how do libertarians want wages to work?

player_034y70

I don't think it invalidates the claim that "Without the minimum wage law, lots of people would probably be paid significantly less." (I believe that's one of the claims you were referring to. Let me know if I misinterpreted your post.)

I don't have a whole lot of time to research economies around the world, but I checked out a couple sources with varying perspectives (two struck me as neutral, two as libertarian). One of the libertarian ones made no effort to understand or explain the phenomenon, but all three others agreed that these countries rely on strong unions to keep wages up.

IMO, that means you're both partially right. As you said, some countries can and do function without minimum wages - it's clearly possible. But as the original poster said, if a country has minimum wage laws, removing those laws will in fact tend to reduce wages.

Some countries without minimum wages still have well-paid workers. Other countries without minimum wages have sweatshops. I think that market forces push towards the "sweatshop" end of the scale (for the reasons described by the original poster), and unions are one of the biggest things pushing back.

The ethics of breeding to kill

player_035y200

Most of the research is aware of that limitation. Either they address it directly, or the experiment is designed to work around it, assuming mental state based on actions just as you suggest.

My point here isn't necessarily that you're wrong, but that you can make a stronger point by acknowledging and addressing the existing literature. Explain why you've settled on suicidal behavior as the best available indicator, as opposed to vocalizations and mannerisms.

This is important because, as gbear605 pointed out, most farms restrict animals' ability to attempt suicide. If suicide attempts are your main criterion, that seems likely to skew your results. (The same is true of several other obvious indicators of dissatisfaction, such as escape attempts.)

The ethics of breeding to kill

player_035y30

I'm afraid I don't have time to write out my own views on this topic, but I think it's important to note that several researchers have looked into the question of whether animals experience emotion. I think your post would be a lot stronger if you addressed and/or cited some of this research.

A way to beat superrational/EDT agents?

player_035y10

I do want to add - separately - that superrational agents (not sure about EDT) can solve this problem in a roundabout way.

Imagine if some prankster erased the "1" and "2" from the signs in rooms A1 and A2, leaving just "A" in both cases. Now everyone has less information and makes better decisions. And in the real contest, (super)rational agents could achieve the same effect by keeping their eyes closed. Simply say "tails," maximize expected value, and leave the room never knowing which one it was.

None of which should be necessary. (Super)rational agents should win even after looking at the sign. They should be able to eliminate a possibility and still guess "tails." A flaw must exist somewhere in the argument for "heads," and even if I haven't found that flaw, a perfect logician would spot it no problem.

A way to beat superrational/EDT agents?

player_035y10

Oh right, I see where you're coming from. When I said "you can't control their vote" I was missing the point, because as far superrational agents are concerned, they do control each other's votes. And in that case, it sure seems like they'll go for the $2, earning less money overall.

It occurs to me that if team 4 didn't exist, but teams 1-3 were still equally likely, then "heads" actually would be the better option. If everyone guesses "heads," two teams are right, and they take home $4. If everyone guesses "tails," team 3 takes home $3 and that's it. On average, this maximizes winnings.

Except this isn't the same situation at all. With group 4 eliminated from the get go, the remaining teams can do even better than $4 or $3. Teammates in room A2 knows for a fact that the coin landed heads, and they automatically earn $1. Teammates in room A1 are no longer responsible for their teammates' decisions, so they go for the $3. Thus teams 1 and 2 both take home $1 while team 3 takes home $3, for a total of $5.

Maybe that's the difference. Even if you know for a fact that you aren't on team 4, you also aren't in a world where team 4 was eliminated from the start. The team still needs to factor into your calculations... somehow. Maybe it means your teammate isn't really making the same decision you are? But it's perfectly symmetrical information. Maybe you don't get to eliminate team 4 unless your teammate does? But the proof is right in front of you. Maybe the information isn't symmetrical because your teammate could be in room B?

I don't know. I feel like there's an answer in here somewhere, but I've spent several hours on this post and I have other things to do today.

A way to beat superrational/EDT agents?

Answer by player_03Aug 17, 2020-10

I'm going to rephrase this using as many integers as possible because humans are better at reasoning about those. I know I personally am.

Instead of randomness, we have four teams that perform this experiment. Teams 1 and 2 represent the first flip landing on heads. Team 3 is tails then heads, and team 4 is tails then tails. No one knows which team they've been assigned to.

Also, instead of earning $1 or $3 for both participants, a correct guess earns that same amount once. They still share finances so this shouldn't affect anyone's reasoning; I just don't want to have to double it.

Team 1 makes 2 guesses. Each "heads" guess earns $1, each "tails" guess earns nothing.

Team 2 makes 2 guesses. Each "heads" guess earns $1, each "tails" guess earns nothing.

Team 3 makes 1 guess. Guessing "heads" earns nothing, guessing "tails" earns $3.

Team 4 makes 1 guess. Guessing "heads" earns nothing, guessing "tails" earns $3.

If absolutely everyone guesses "heads," teams 1 and 2 will earn $4 between them. If absolutely everyone guesses "tails," teams 3 and 4 will earn $6 between them. So far, this matches up.

Now let's look at how many people were sent to each room.

Three people visit room A1: one from team 1, one from team 2, and one from team 3. 2/3 of them are there because the first "flip" was heads.

Three people visit room A2: one from team 1, one from team 2, and one from team 4. 2/3 of them are there because the first "flip" was heads.

Two people visit room B: one from team 3 and one from team 4. They don't matter.

The three visitors to A1 know they aren't on team 4, thus they can subtract that team's entire winnings from their calculations, leaving $4 vs. $3.

The three visitors to A2 know they aren't on team 3, thus they can subtract that team's entire winnings from their calculations, leaving $4 vs. $3.

Do you see the error? Took me a bit.

If you're in room A1, you need to subtract more than just team 4's winnings. You need to subtract half of team 1 and team 2's winnings. Teams 1 and 2 each have someone in room A2, and you can't control their vote. Thus:

Three people visit room A1: one from team 1, one from team 2, and one from team 3. If all three guess "heads" they earn $2 in all. If all three guess "tails" they earn $3 in all.

Three people visit room A1: one from team 1, one from team 2, and one from team 4. If all three guess "heads" they earn $2 in all. If all three guess "tails" they earn $3 in all.

Guessing "tails" remains the best way to maximize expected value.

---

The lesson here isn't so much to do with EDT agents, it's to do with humans and probabilities. I didn't write this post because I'm amazing and you're a bad math student, I wrote this post because without it, I wouldn't have been able to figure it out either.

Whenever this sort of thing comes up, try to rephrase the problem. Instead of 85%, imagine 100 people in a room, with 85 on the left and 15 on the right. Instead of truly random experiments, imagine the many-worlds interpretation, where each outcome is guaranteed to come up in a different branch. (And try to have an integer number of branches, each representing an equal fraction.) Or use multiple teams like I did above.

LESSWRONG
LW

Posts

Wikitag Contributions

Comments