Feels like there's a missing deep learning paradigm that is the equivalent of the human "go for a walk and think about stuff, absent new stimuli". There are some existing approaches that hint at this (generative replay / dreaming), but those feel a bit different from my subjective sense that I'm "working things out" when I go for a walk, rather than dreaming generatively, as I do at night.
Relatedly: my overall cognitive output drops when I go through periods of depriving myself of these idle stretches by jamming them full of stimuli (e.g. podcasts). I do better when I bite the bullet, accept the boredom, and let my mind work out whatever it needs to.
Speculatively, perhaps the two are related: overstimulated ML researchers have a blind spot to the fact that understimulating a neural net might be helpful.
There’s a particular type of cognitive failure that I reliably experience. It seems like a pure misconfiguration of the mind, and I've found it very difficult to will myself out of, which makes it feel like some kind of fundamental limitation.
The quickest way to illustrate this is with an example: I'm playing a puzzle game that requires arranging 8 letters into a word, and I'm totally stuck. As soon as I get a hint telling me the first letter, I can instantly find the word.
This seems wrong. In theory, I expect I can just iterate through each of the 8 letters, tell myself that's the first one, look for the word, and move on if the word isn't there. If it took me 2 seconds to “see the word” for a given starting letter (so, not quite instant, budgeting some context-switching time), then I ought to be able to reliably find the word within 16 seconds. Yet I find that I can't do it even with a much greater time budget. It's almost as if not being sure whether the particular letter I'm iterating through is actually the starting one reduces my mental horsepower and prevents me from earnestly trying to find the word, as if I expect to feel some mental pain from whole-heartedly looking for something that may not be there.
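For concreteness, here's a minimal sketch (Python, against a made-up four-word dictionary) of the brute-force strategy I have in mind: fix each candidate first letter in turn, then look for a word that starts with it and uses exactly the available letters. A machine does this trivially; the puzzle is why my brain won't.

```python
from collections import Counter

def find_word(letters, dictionary):
    """Mirror the 'fix the first letter, then look' strategy: for each candidate
    first letter, scan for a word that starts with it and uses exactly these letters."""
    target = Counter(letters)
    for first in sorted(set(letters)):            # at most 8 candidate first letters
        for word in dictionary:
            if (len(word) == len(letters)
                    and word[0] == first
                    and Counter(word) == target):
                return word                        # the "seeing the word" step
    return None

# Hypothetical usage with a tiny word list; any real dictionary file would do.
words = {"triangle", "integral", "relating", "altering"}
print(find_word("altering", words))                # all four are anagrams of the same letters
```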
So I'm truly stuck, and then when I get the hint, I'm insta-unstuck. I've had the same feeling when solving other puzzles or even math problems: I (with high confidence) know the superset from which the beginning of the solution is drawn, yet iterating through the superset does not make the solution pop to mind. However, once I get a hint that tells me *for sure* what the beginning of the solution is, I can solve the rest of it.
What gives? How to fix? Does this have implications outside of puzzles, in the more open-ended world of e.g. looking to make new friends, or to find a great business model?
(Putting this into Grok tells me it doesn’t have a name but it’s a commonly known thing in the puzzle-solving community, of which I’m not part.)
Proposal: if you're a social media or other content-based platform, add a long-press option to the "share" button that lets users choose between a "hate share" and a "love share".
Specifically:
* quick tap: keep the current functionality, you get to send the link wherever / copy to clipboard
* long press and swipe to either hate or love share: you still get to send the link (optionally, the URL has some argument indicating it's a hate / love share, if the link is a redirect through the social media platform)
This would allow users to separate the things that are worth sharing but that they hate and want to see less of from the things they love and want to see more of, and it might defang the currently powerful strategy (with massive negative social externalities) of generating outrage content just to get more shares.
Social media companies can, in turn, use this signal to dial back the virality of hate-shared content relative to love-shared content, if they choose to do so.
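To make the mechanics concrete, here's a sketch of what tagging the redirect link could look like; the `share_sentiment` parameter name and the example domains are made up, not any platform's actual API.

```python
from urllib.parse import urlencode

# Hypothetical redirect base; a real platform would use its own redirect / link-shortening service.
REDIRECT_BASE = "https://social.example/r"

def build_share_link(target_url, sentiment=None):
    """Build a redirect link, optionally tagged as a 'hate' or 'love' share.
    A quick tap passes sentiment=None and behaves exactly like today's share button."""
    params = {"url": target_url}
    if sentiment in ("hate", "love"):
        params["share_sentiment"] = sentiment
    return f"{REDIRECT_BASE}?{urlencode(params)}"

# Quick tap: unchanged behavior.
print(build_share_link("https://news.example/article"))
# Long press + swipe: the platform can count hate vs. love shares when the redirect is hit.
print(build_share_link("https://news.example/article", sentiment="hate"))
```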
You're right, this is not a morality-specific phenomenon. I think there's a general formulation of this that just has to do with signaling, though I haven't fully worked out the idea yet.
For example, if in a given interaction it's important for your interlocutor to believe that you're a human and not a bot, and you have something to lose if they are skeptical of your humanity, then there are lots of negative externalities that come from the Internet being filled with indistinguishable-from-human chatbots, irrespective of its morality.
Since you marked as a crux the fragment "absent acceleration they are likely to die some time over the next 40ish years", I wanted to share two possibly relevant Metaculus questions. Both of these seem to suggest longer timelines than your estimates (and they presumably already include the potential impacts of AGI/TAI and ASI, so they don't carry the "absent acceleration" caveat).
OK, agreed that this depends on your views on whether cryonics will work in your lifetime, and on "baseline" AGI/ASI timelines absent your finger on the scale. As you noted, it also depends on the delta between p(doom while accelerating) and baseline p(doom).
I'm guessing there's a decent number of people who think current (and near-future) cryonics won't work, and that ASI is further away than 3-7 years (to use your range). Certainly the world mostly isn't behaving as if it believed ASI was 3-7 years away, which might be a total failure of people acting on their beliefs, or it may just reflect that their beliefs put the numbers further out.
Simple math suggests that anybody who is selfish should be very supportive of acceleration towards ASI even for high values of p(doom).
Suppose somebody over the age of 50 thinks that p(doom) is on the order of 50%, and is totally selfish. It seems rational for them to support acceleration, since absent acceleration they are likely to die some time over the next 40ish years (because it's improbable we'll have life extension tech in time), but if we successfully accelerate to ASI, there's a 1-p(doom) shot at an abundant and happy eternity.
Possibly some form of this extends beyond total selfishness.
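To spell out the "simple math" as a toy expected-utility calculation: the utility numbers below are made up, and the conclusion hinges entirely on how large you take the value of a post-ASI eternity to be relative to a normal remaining lifespan.

```python
# Toy expected-utility sketch of the selfish-accelerationist argument.
# All numbers are illustrative assumptions, not estimates.
p_doom = 0.5          # chance that accelerating to ASI ends in doom

u_normal = 1.0        # no acceleration: live out a normal remaining lifespan, then die
u_doom = 0.0          # doom: die somewhat earlier than the default, roughly comparable to dying anyway
u_eternity = 1000.0   # success: an abundant, happy, very long post-ASI life

ev_no_accel = u_normal
ev_accel = p_doom * u_doom + (1 - p_doom) * u_eternity

print(ev_no_accel, ev_accel)  # 1.0 vs 500.0: acceleration dominates for this selfish agent
```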
So, if your ideas have potentially important upside, and no obvious large downside, please share them.
What would be some examples of an obviously large downside? Something that comes to mind is anything that tips the current scales in a bad way, like a novel research result that directs researchers toward more rapid capabilities increases without a commensurate increase in alignment. Anything else?
Immorality has negative externalities which are diffuse and hard to count, but quite possibly worse than its direct effects.
Take the example of Alice lying to Bob about something, to her benefit and his detriment. I will call the effects of the lie on Alice and Bob direct, and the effects on everybody else externalities. Concretely, the negative externalities here are that Bob is, on the margin, going to trust others less in the future for having been lied to by Alice than he would if Alice had been truthful. So in all of Bob's future interactions, his truthful counterparties will have to work extra hard to prove that they are truthful, and in some cases there may be potentially beneficial deals that simply won't occur due to Bob's suspicions and his trying to avoid being betrayed.
This extra work that Bob's future counterparties have to put in, as well as the lost value from missed deals, adds up to a meaningful cost. The cost may extend beyond Bob, since everyone else who finds out that Bob was lied to by Alice will update their priors in the same direction as Bob, creating second-order costs. What's more, since everyone now thinks their counterparties suspect them of lying (marginally more), the reputational cost of lying drops: they already feel like they're considered partial liars, so the cost of confirming that is less than if they felt they were seen as totally truthful. As a result, everyone might actually become more likely to lie.
So there's a cost of deteriorating social trust, of p*ssing in the pool of social commons.
One consequence that seems to flow from this, which I personally find morally counter-intuitive and don't actually believe but cannot logically dismiss, is that if you're going to lie, you have a moral obligation not to get found out. This way, the damage of your lie is at least limited to its direct effects.
Absence of evidence is the dark matter of inference. It's invisible yet it's paramount to good judgement.
It's easy to judge X to be true if you see some evidence that could only come about if X were true. It's a lot more subtle to judge X to be false when you do see some evidence that it's true, but you can also determine that there is lots of evidence you would expect to see if it were true that is missing.
In a formalized setting like an RCT this is not an issue, but when reasoning in the wild, it is the norm. I'm guessing this leads to a bias toward too many false positives on any issue where you care to look deeply enough to find and cherry-pick the positive evidence.
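To put a toy Bayesian number on this (all probabilities invented): one observed piece of positive evidence raises the posterior, but a few pieces of missing evidence that you'd expect to see if X were true can push it well below the prior.

```python
# Toy sketch: absence of expected evidence is itself evidence of absence.
# All probabilities here are invented for illustration.
prior_odds = 1.0                      # start at P(X) = 0.5

# One piece of positive evidence we did observe.
lr_observed = 5.0                     # P(E_obs | X) / P(E_obs | not X)

# Three pieces of evidence we'd expect under X but which are missing.
# Likelihood ratio of *not* seeing each one: P(missing | X) / P(missing | not X).
lr_missing = (1 - 0.8) / (1 - 0.1)    # ~0.22: each missing piece counts against X

posterior_odds = prior_odds * lr_observed * lr_missing ** 3
posterior_prob = posterior_odds / (1 + posterior_odds)
print(round(posterior_prob, 3))       # ~0.052: the missing evidence outweighs the one positive hit
```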
EDIT: Correcting the opening sentence to say "absence of evidence" rather than the original "negative evidence".