Comments

(I guess I appreciate being thought of, but it does seem to somewhat undermine your point to tag people who haven't used the site in... checks... seven-almost-eight years.)

Where's that 'should' coming from? (Or are you just explaining the concept rather than endorsing it?)

This was basically my answer - I can't play as an AI using this strategy, for obvious reasons, but an AI that used its one sentence to give a novel and easily-testable solution to a longstanding social problem of some sort (or an easily-testable principle that suggests one or more novel solutions) would probably get at least a second sentence from me (though not a typed response; that seems to open up a risky channel). Especially if the AI in question didn't actually have access to a lot of information about human culture or me personally and had to infer that a solution like that would be useful from near-base principles - that's not proof of Friendliness, but an AI using its one guaranteed communication to do something that has a decent chance of improving the world per our definition without any prompting whatsoever sure looks suspiciously like Friendly to me.

That parses as 'do not let others conduct experiments'. Probably not what you're aiming for.

If you have the resources to put something at the south pole, you probably have the resources to scatter a couple dozen stonehenges/pyramids/giant stone heads around; then you don't have to specify unambiguously, plus redundancy is always good.

I think it's a failed utopia because it involves the AI modifying the humans' desires wholesale - the fact that it does so by proxy doesn't change that it's doing that.

(This may not be the only reason it's a failed utopia.)

Actually, it's my bad - I found your comment via the new-comments list, and didn't look very closely at its context.

As to your actual question: Being told that someone has evidence of something is, if they're trustworthy, not just evidence of the thing, but also evidence of what other evidence exists. For example, in my scenario with gwern's prank, before I've seen gwern's web page, I expect that if I look the mentioned drug up in other places, I'll also see evidence that it's awesome. If I actually go look the drug up and find out that it's no better than placebo in any situation, that's also surprising new information that changes my beliefs - the same change that seeing gwern's "April Fools" message would cause, in fact, so when I do see that message, it doesn't surprise me or change my opinion of the drug.

In your scenario, I trust Merck's spokesperson much less than I trust gwern, so I don't end up with nearly so strong a belief that third parties will agree that the drug is a good one - looking it up and finding out that it has dangerous side effects wouldn't be surprising, so I should take the chance of that into account to begin with, even if the Merck spokesperson doesn't mention it. This habit of taking into account, when talking to untrustworthy people, information that might come from third parties (or that could be discovered in other ways, but that the person you're speaking to wouldn't tell you even if they'd discovered it) is the intended lesson of the original post.

Someone claiming that they have evidence for a thing is already evidence for the thing, if you trust them at all, so you can update on that, and then revise that update based on how good the evidence turns out to be once you actually get it.

For example, say gwern posts to Discussion that he has a new article on his website about some drug, and he says "tl;dr: It's pretty awesome" but doesn't give any details, and when you follow the link to the site you get an error and can't see the page. gwern's put together a few articles now about drugs, and they're usually well-researched and impressive, so it's pretty safe to assume that if he says a drug is awesome, it is, even if that's the only evidence you have. This is a belief about both the drug (it is particularly effective at what it's supposed to do) and what you'll see when you're able to access the page about it (there will be many citations of research indicating that the drug is particularly effective).

Now, say a couple days later you get the page to load, and what it actually says is "ha ha, April Fools!". This is new information, and as such it changes your beliefs - in particular, your belief that the drug is any good goes down substantially, and any future cases of gwern posting about an 'awesome' drug don't make you believe as strongly that the drug is good - the chance that it's good if there is an actual page about it stays about the same, but now you also have to factor in the chance that it's another prank - or in other words that the evidence you'll be given will be much worse than is being claimed.

It's harder to work out an example of evidence turning out to be much stronger than is claimed, but it works on the same principle - knowing that there's evidence at all means you can update about as much as you would for an average piece of evidence from that source, and then when you learn that the evidence is much better, you update again based on how much better it is.
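
A rough numeric sketch of the two-step update described above may help. Every probability here is made up purely for illustration (none of the numbers come from the discussion), and `bayes_update` is just a hypothetical helper name for applying Bayes' rule to one observation:

```python
def bayes_update(prior, p_obs_if_true, p_obs_if_false):
    """Return P(hypothesis | observation) using Bayes' rule."""
    numerator = prior * p_obs_if_true
    return numerator / (numerator + (1 - prior) * p_obs_if_false)

# Hypothesis: "the drug is good". Assumed prior before hearing anything.
prior_good = 0.10

# Step 1: a trusted source says "it's pretty awesome" but the page won't load.
# Assumption: they'd say this about 90% of good drugs and 5% of bad ones.
after_claim = bayes_update(prior_good, 0.90, 0.05)

# Step 2: the page finally loads and says "April Fools!".
# Assumption: a prank page is far more likely if the drug is not actually good.
after_prank = bayes_update(after_claim, 0.01, 0.50)

print(f"prior:       {prior_good:.3f}")
print(f"after claim: {after_claim:.3f}")
print(f"after prank: {after_prank:.3f}")
```

With these assumed numbers, the bare claim moves the belief up a long way, and seeing the actual (much worse than advertised) evidence moves it back down past where it started. The same function applied with a stronger-than-expected observation in step 2 would push the belief further up instead.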

What would a ritual that's just about rationality and more complex than a group recitation of the Litany of Tarski look like?

non-zero engineering resources

effectively zero

Getting someone to sort a list, even on an ongoing basis, is not functionally useful if there's nobody to take action on the sorted list.
