Dweomite · Comments (sorted by newest)
Insofar As I Think LLMs "Don't Really Understand Things", What Do I Mean By That?
Dweomite · 11h

A smart human-like mind looking at all these pictures would (I claim) assemble them all into one big map of the world, like the original, either physically or mentally.

On my model, humans are pretty inconsistent about doing this.

I think humans tend to build up many separate domains of knowledge that they rarely compare, and they can even hold opposite heuristics, selectively remembering whichever one agrees with their current conclusion.

For example, I once had a conversation about a video game where someone said you should build X "as soon as possible", and then later in the conversation they posted their full build priority order and X was nearly at the bottom.

In another game, I once noticed that I had a presumption that +X food and +X industry are probably roughly equally good, and also a presumption that +Y% food and +Y% industry are probably roughly equally good. But at typical food and industry levels, these presumptions were contradictory: +10% industry might come out to about 5 industry, while +10% food might come out to more like 0.5 food. I played for dozens of hours before noticing the conflict.
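To make the inconsistency concrete, here is a minimal sketch with invented income numbers (the actual game's values don't matter, only their ratio):

```python
# Hypothetical mid-game income levels (invented for illustration).
food_income = 5.0       # food per turn
industry_income = 50.0  # industry per turn

# Presumption 1: +X food is worth about as much as +X industry.
# Presumption 2: +Y% food is worth about as much as +Y% industry.

# At these levels, a +10% bonus converts to very different flat amounts:
food_bonus = 0.10 * food_income          # +0.5 food per turn
industry_bonus = 0.10 * industry_income  # +5.0 industry per turn

# If presumption 1 holds, +5.0 industry is worth ~10x as much as +0.5
# food, so presumption 2 cannot hold at the same time (and vice versa).
print(food_bonus, industry_bonus)  # 0.5 5.0
```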

On Fleshling Safety: A Debate by Klurl and Trapaucius.
Dweomite · 3d

I don't think Eliezer's actual real-life predictions are narrow in anything like the way Klurl's coincidentally-correct examples were narrow.

Also, Klurl acknowledges several times that Trapaucius' arguments do have non-zero weight, just nothing close to the weight they'd need to overcome the baseline improbability of such a narrow target.

On Fleshling Safety: A Debate by Klurl and Trapaucius.
Dweomite · 4d

Thank you for being more explicit.

If you write a story where a person prays and then wins the lottery as part of a demonstration of the efficacy of prayer, that is fictional evidence even though prayer and winning lotteries are both real things.

In your example, it seems to me that the cheat is specifically that the story presents an outcome that would (legitimately!) be evidence of its intended conclusion IF that outcome were representative of reality, but in fact most real-life outcomes would have supported the conclusion much less than that. (i.e. there are many more people who pray and then fail to win the lottery, than there are people who pray and then do win.)

If you read a story where someone tried and failed to build a wooden table, then attended a woodworking class, then tried again to build a table and succeeded, I think you would probably consider that a fair story. Real life includes some people who attend woodworking classes and then still can't build a table when they're done, but the story's outcome is reasonably representative, and therefore it's fair.

Notice that, in judging one of these fair and the other unfair, I am relying on a world-model that says that one (class of) outcome is common in reality and the other is rare in reality. Hypothetically, someone could disagree about the fairness of these stories based only on having a different world-model, while using the same rules about what sorts of stories are fair. (Maybe they think most woodworking classes are crap and hardly anyone gains useful skills from them.)

But I do not think a rare outcome is automatically unfair. If a story wants to demonstrate that wishing on a star doesn't work by showing someone who needs a royal flush, wishes on a star, then draws a full house (thereby losing), the full house is an unlikely outcome, but since it's unlikely in a way that doesn't support the story's aesop, it's not being used as a cheat. (In fact, notice that every exact set of 5 cards they might have drawn was unlikely.)
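As a quick check of that last claim (my arithmetic, not part of the story): there are about 2.6 million distinct five-card hands, so any one exact hand is wildly unlikely.

```python
import math

# Number of distinct 5-card hands from a standard 52-card deck.
total_hands = math.comb(52, 5)
print(total_hands)      # 2598960
print(1 / total_hands)  # ~3.85e-07: any one exact hand is very unlikely
```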

 

If your concern is that Klurl and Trapaucius encountered a planet that was especially bad for them in a way that makes their situation seem far more dangerous than was statistically justified based on the setup, then I think Eliezer probably disagrees with you about the probability distribution that was statistically justified based on the setup.

If, instead, your concern is that the correspondence between Klurl's hypothetical examples and what they found when reaching the planet was improbably high, then I agree that is very coincidental, but I do not think that coincidence is being used as support for the story's intended lessons. The story is not trying to convince you that Klurl can narrowly predict exactly what they'll find, and in fact Klurl denies this several times.

The coincidence could perhaps cause some readers to conclude a high degree of predictability anyway, despite lack of intent. I'd consider that a bad outcome, and my model of Eliezer also considers that a bad outcome. I'm not sure there was a good way to mitigate that risk without some downside of equal or greater severity, though. I think there's pedagogical value in pointing out a counter-example that is familiar to the reader at the time the argument is being made, and I don't think any simple change to the story would allow this to happen without it being an unlikely coincidence.

On Fleshling Safety: A Debate by Klurl and Trapaucius.
Dweomite · 5d

I notice I am confused about nearly everything you just said, so I imagine we must be talking past each other.

On Fleshling Safety: A Debate by Klurl and Trapaucius.
Dweomite · 7d

On the contrary: This is perhaps the only way the story could avoid generalizing from fictional evidence. Your complaint about Klurl's examples is that they are "coincidentally" drawn from the special class of examples that we already know are actually real, which makes them not fictional. Any examples that weren't special in this way would be fictional evidence, and readers could object that we're not sure those examples are actually possible.

If you think that the way the story played out was misleading, that seems like a disagreement about reality, not a disagreement about how stories should be used. Any given story must play out in one particular way, and whether that one way is representative or unrepresentative is a question of how it relates to reality, not a question of narrative conventions. If Trapaucius had arrived at the planet to find Star Trek technology and been immediately beamed into a holding cell, would that somehow have been less of a cheat, because it wasn't real?

Why Is Printing So Bad?
Dweomite · 7d

I would agree that, while reality-in-general has a surprising amount of detail, some systems still have substantially more detail than others, and this model applies more strongly to systems with more detail. I think of computer-based systems as being in a relatively-high-detail class.

I also think there are things you can choose to do when building a system to make it more durable, and so another way that systems vary is in how much up-front cost the creator paid to insulate the system against entropy. I think furniture has traditionally fallen into a high-durability category, as an item that consumers expect to be very long-lived...although I think modernity has eroded this tradition somewhat.

Why Is Printing So Bad?
Dweomite · 7d

I have a tentative model for this category of phenomenon that goes something like:

  1. Reality has a surprising amount of detail. Everyday things that you use all the time and appear simple to you are actually composed of many sub-parts and sub-sub-parts all working together.
  2. The default state of any sub-sub-part is to not be in alignment with your purpose. There are many more ways for a part to be badly-aligned than for it to be well-aligned, so in order for it to be aligned, there has to be (at some point) some powerful process that selectively makes it be aligned.
  3. Even if a part was aligned, the general nature of entropy means there are many petty, trivial reasons that it could stop being aligned with little fanfare. (Though the mean-time-to-misalignment can vary dramatically depending on which part we're talking about.)
  4. So, it shouldn't be surprising when we find that a complex system is broken in seven different ways for trivial and banal reasons. That's the default outcome if you just put a system in a box and leave it there for a while.

OK, but if that's the default state, then how do I explain the systems that aren't like that?

  1. Suppose we have a system that is initially working perfectly until, one day, one tiny thing goes wrong with the system.
  2. If people use the system frequently and care about the results, then someone will promptly notice that there is one tiny thing wrong.
  3. If the person who discovers this expects to continue using the system in the future, they have an incentive to fix the problem.
  4. If there is only one problem, and it is tiny, then the cost to diagnose and fix the problem is probably small.
  5. So, very often, the person will just go ahead and fix it, immediately and at their own expense, just to make the problem go away.
  6. No one keeps careful track of this--not even the person performing the fix. So this low-level ongoing maintenance fades into the background and gets forgotten, creating the illusion of a system that just continues working on its own.
    1. This is especially true for multi-user systems where no individual user does a large percentage of the maintenance.

I don't think this invisible-maintenance situation describes the majority of systems, but I think it does describe the majority of user-system interactions, because the systems that get this sort of maintenance tend to be the ones that are heavily used. This creates the illusion that this is normal.
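A toy simulation of this model (every rate below is invented purely for illustration): each day a small fault appears with some probability, and on days when the system is used, the user quietly fixes one outstanding fault. Heavily-used systems then hover near zero faults, while rarely-used ones accumulate them.

```python
import random

def simulate(days, p_fault, use_every):
    """Return the number of outstanding faults after `days`,
    given a per-day fault probability and a usage interval."""
    faults = 0
    for day in range(days):
        if random.random() < p_fault:
            faults += 1  # entropy: one more small thing quietly breaks
        if day % use_every == 0 and faults > 0:
            faults -= 1  # a user notices a problem and just fixes it
    return faults

random.seed(0)
# Identical fault rates; only the usage frequency differs.
print(simulate(days=365, p_fault=0.05, use_every=1))   # daily use: stays near 0
print(simulate(days=365, p_fault=0.05, use_every=60))  # rare use: faults pile up
```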

Some of the ways this can fail include:

  • Users cannot tell that the system has developed a small problem
    • Maybe the system's performance is too inconsistent for a small problem to be apparent
    • Maybe the operator is not qualified to judge the quality of the output
    • Maybe the system is used so infrequently that there's time for several problems to develop between uses
  • For an individual user, the cost (to that user) of fixing a problem is higher than that user's selfish benefit from having it fixed
    • Maybe no single user expects to use the system very many times in the future
    • Maybe users lack the expertise or the authority to perform the fix (and there is no standard channel for maintenance requests that is sufficiently cheap and reliable)
    • Maybe the system is just inherently expensive to repair or to debug (relative to the value the system provides to a single user)
On Fleshling Safety: A Debate by Klurl and Trapaucius.
Dweomite · 8d

On my reading, most of Klurl's arguments are just saying that Trapaucius is overconfident. Klurl gives many specific examples of ways things could be different than Trapaucius expects, but Klurl is not predicting that those particular examples actually will be true, just that Trapaucius shouldn't be ruling them out.

"I don't recall you setting an exact prediction for fleshling achievements before our arrival," retorted Trapaucius.

"So I did not," said Klurl, "but I argued for the possibility not being ruled out, and you ruled it out.  It is sometimes possible to do better merely by saying 'I don't know'"

Eliezer chooses to use many specific examples that do happen to be actually true, which makes Klurl's guesses extremely coincidental within the story. This is bad for verisimilitude, but it makes the examples easier for the reader to understand, and it makes a clearer and more watertight case that Trapaucius' arguments are logically unsound.

On Fleshling Safety: A Debate by Klurl and Trapaucius.
Dweomite · 11d

Are not speculative arguments about reality normally shelved as nonfiction?

On Fleshling Safety: A Debate by Klurl and Trapaucius.
Dweomite · 12d

First to superintelligence wins

This phrasing seems ambiguous between the claims "the first agent to BE superintelligent wins" and "the first agent to CREATE something superintelligent wins".

This distinction might be pretty important to your strategy.
