Re: cells defecting by becoming gametes, I think you were maybe a bit too terse. I believe I've figured out what's going on, but let me run it by you:
*Within the organism*, there's no selection pressure for cells to become gametes--mutations are random variations, not strategic actors, so a leaf is no more likely to 'decide' to become a flower than the reverse (which would also be harmful overall). The organism *does* have an incentive to keep the random mutation rate down, but no reason to *specifically* combat cells 'defecti...
This makes a lot more sense with some background on what a ribozyme is, which I lacked before reading this. AIUI certain sequences of RNA fold up in a way that makes them act as enzymes.
Though the real point isn't about biology, but rather generic coordination mechanisms...
FWIW I first read this post before this comment was written, then happened to think about it again today and had this idea, and came here to post it.
I do think it's a dangerous fallacy to assume mutually-altruistic equilibria are optimal--'I take care of me, you take care of you' is sometimes more efficient than 'you take care of me, I take care of you'.
Maybe someone needs to study whether Western countries ever exhibit "antisocial cooperation," that is, an equilibrium of enforced public contributions in an "ineffici...
So the big question here is, why are zetetic explanations good? Why do we need or want them when civilization will happily supply us with finished bread, or industrial yeast, or rote instructions for how to make sourdough from scratch? The paragraph beginning "Zetetic explanations are empowering" starts to answer, but a little bit vaguely for my tastes. Here's my list of possible answers:
1) Subjective reasons. They're fun or aesthetically pleasing. This feels like a throwaway reason, and doesn't get listed explicitly in the OP unle...
I think what we need is some notion of mediation. That is, a way to recognize that your liver's effects on your bank account are mediated by effects on your health and it's therefore better thought of as a health optimizer.
This has to be counteracted by some kind of complexity penalty, though, or else you can only ever call a thing a [its-specific-physical-effects-on-the-world]-maximizer.
I wonder if we might define this complexity penalty relative to our own ontology. That is, to me, a description of what specifically the liver does requires lots...
Perceived chemical-ness is a very rough heuristic for the degree of optimization a food has undergone for being sold in a modern economy (see http://slatestarcodex.com/2017/04/25/book-review-the-hungry-brain/ for why this might be something you want to avoid). Very, very rough--you could no doubt list examples of 'non-chemicals' that are more optimized than 'chemicals' all day, as well as optimizations that are almost certainly not harmful. And yet I'd wager the correlation is there.
Why assume that there is such a thing?
I took Benquo to be saying there was such a qualitative difference. I already agree there are lots of reasons Duncan's proposal would likely do more harm than good.
Unilateral imposition of rules.
What Duncan is proposing is a general societal agreement to allow the Punch Bug game, on a dubious but IMO sincerely-held theory that this would be to the general benefit. It's no more a unilateral imposition than a law you voted against.
So I definitely will join you in condemning the no-opt-out rule. The ghettoization proposal... honestly, I think it was too absurd to me to even generate a coherent image, but if I try to force my imagination to produce one it's pretty horrible.
I'm not sure I see the folding-in problem as keenly as you do. I read Duncan as saying "there's a problem in that we freak out too much about accidental micro harms. My proposed solution is a framework of intentional micro-harms". The first part is on firmer ground than the second, but I don...
> The ghettoization proposal… honestly, I think it was too absurd to me to even generate a coherent image, but if I try to force my imagination to produce one it’s pretty horrible.
This is actually a huge part of what I was upset about, and it's really helpful to have you make that explicit: The fact that no one else seems to have bothered to take the initiative to concretely visualize this proposal and respond to the implications of its literal content. And then, when I tried to point out the problem by pointing out a structural analogy to a thing there...
Since I'm in another thread doing a thing that's sort of weirdly adjacent to supporting Duncan's post, let me say what I think of it overall.
I played a bit of Punch Bug as a kid (actually, it was 'Punch Buggy' where I grew up). In my social circles the punches weren't hard; it was basically a token gesture to make spotting a Beetle first into a form of 'winning'. I'd compare it to nonviolent games such as jinx or five minutes to get rid of that word. Personally I found all of these a little fun, a little annoyi...
What I'm trying to figure out is what important qualitative trait Punch Bug shares with a day of pogroms, that an absence of noise ordinances doesn't also share. (All three of these things share the traits of being bad policy, and of hurting some more than others)
'Involvement of physical violence' is one such trait, and you could build a colorable argument that we shouldn't encourage even small amounts of physical violence, but I didn't think that was Benquo's whole argument.
Other than that, there's the no-punch-back...
The no-punchback rule is really the main thing for me, especially in conjunction with the "it sure seems like you're playing" no-opt-out rule and the proposal that "we" ghettoize people who don't want to participate. If Duncan were just saying people should get into friendly fights more often, I wouldn't like the proposal, but I don't think it would be terrifyingly creepy to me.
Additionally sketchy is the way this was folded into a long and otherwise-reasonable discussion of why we should chill out about casual infliction of minor harms on o...
Let me clarify: I believe that if you took all of the people who currently want to play Punch Bug, and put them all in one community, they would continue to play Punch Bug. They would *not* find that the absence of unwilling victims spoiled the fun, because unwilling victims were never the source of the fun.
"A punch" and "a punch in the arm" are quite different, largely in that the latter is unlikely to cause brain injury.
(Posted early by accident, ETA:)
That said, I get the argument about training people to ignore street violence. I'm a bit doubtful of the effect size here, given that I think there are clear markers of a friendly hit, but I could be persuaded otherwise.
As for no loudback: suppose a neighborhood had a policy against loud noise unless you register a party. Only one party can be registered per night. Registration is first com...
I get that part. Yes, the Punch Bug game is disparately impactful against those who value not-being-punched more than they value getting-to-punch, especially if they value getting-to-punch at zero. You could say the same about many things, such as throwing loud parties.
That said, I think there's an important difference between a policy chosen in spite of the fact that it harms some people, and one chosen because of that fact. Yes, the latter has been known to masquerade as the former, but I don't think that's what's going on here (this ...
Let me back up. Zvi convinced me there was a big important click to be had here, and I'm bothered that I haven't had the click. My current understanding of your argument is unpersuasive. That probably means it's an incorrect understanding.
Maybe our crux is that I don't think the Punch Bug game was ever significantly about hurting people who don't want to play it?
If after reading this thread you don't think that, I worry that you haven't grokked the thing Benquo is trying to point at.
I definitely haven't grokked the thing Benquo is trying to point at, at all. (I'm plenty Jewish by any anti-semite's definition, fwiw). I don't see what's asymmetric about the 'no punch back' rule at all--the punchee is free to spot the next bug, in which case they will become the beneficiary of the 'no punch back' rule.
I don't think it's just a matter of long vs. short term that makes or breaks backwards chaining--it's more a matter of the backwards branching factor.
For chess, this is enormous--you can't disjunctively consider every possible mate, nor can you break them into useful categories to reason about. And for each possible mate, there are too many immediate predecessors to get useful information. You can try to break the mates into categories and reason about those, but the details are so important here that you're unlikely to get ...
Datum: The existence of this prize has spurred me to put some actual effort into AI alignment, for reasons I don't fully understand--I'm confident it's not about the money, and even the offer of feedback isn't that strong an incentive, since I think anything worthwhile I posted on LW would get feedback anyway.
My guess is that it sends the message that the Serious Real Researchers actually want input from random amateur LW readers like me.
Also, the first announcement of the prize rules was in one ear and out the other for me. Reading thi...
This. I've decided that I'm done with organizing paper. Anything I'll ever need to read again, I make digital from the start. But I still use paper routinely, in essentially write-only fashion.
This is also a great thing about whiteboards--they foreclose even the option of creating management burden for yourself.
Honestly I'm not sure Oracles are the best approach either, but I'll push the Pareto frontier of safe AI design wherever I can.
Though I'm less worried about the epistemic flaws exacerbating a box-break--it seems an epistemically healthy AI breaking its box would be maximally bad already--but more about the epistemic flaws being prone to self-correction. For instance, if the AI constructs a subagent of the 'try random stuff, repeat whatever works' flavor.
The practical difference is that the counterfactual oracle design doesn't address side-channel attacks, only unsafe answers.
Internally, the counterfactual oracle is implemented via the utility function: it wants to give an answer that would be accurate if it were unread. This puts no constraints on how it gets that answer, and I don't see any way to extend the technique to cover the reasoning process.
My proposal is implemented via a constraint on the AI's model of the world. Whether this is actually possible depends on the details of the AI; an...
I'm not sure your refutation of the leverage penalty works. If there really are 3 ↑↑↑ 3 copies of you, your decision conditioned on that may still not be to pay. You have to compare
P(A real mugging will happen) x U(all your copies die)
against
P(fake muggings happen) x U(lose five dollars) x (expected number of copies getting fake-mugged)
where that last term will in fact be proportional to 3 ↑↑↑ 3. Even if there is an incomprehensibly vast matrix, its Dark Lords are pretty unlikely to mug you for petty cash. And this plausibly does make you pay in the Muggle case, since P(fake muggings happen) is way down if 'mugging' involves tearing a hole in the sky.
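To make the comparison above explicit, here is one way to write it as a single inequality. This is only a sketch of my reading: the symbols $p_r$, $p_f$ and $N_f$ are shorthand I'm introducing here, and I'm treating both utilities as magnitudes of loss.

```latex
\[
\text{paying is favored only if}\quad
p_r \cdot \bigl\lvert U(\text{all your copies die}) \bigr\rvert
\;>\;
p_f \cdot \bigl\lvert U(\text{lose five dollars}) \bigr\rvert \cdot \mathbb{E}[N_f],
\qquad
\mathbb{E}[N_f] \propto 3\uparrow\uparrow\uparrow 3
\]
% p_r  : P(a real mugging will happen)
% p_f  : P(fake muggings happen)
% N_f  : number of copies getting fake-mugged
```

In the Muggle case, $p_f$ drops sharply, which is what can tip the inequality toward paying.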
I think I disagree with your approach here.
I, and I think most people in practice, use reflective equilibrium to decide what our ethics are. This means that we can notice that our ethical intuitions are insensitive to scope, but also that upon reflection it seems like this is wrong, and thus adopt an ethics different from that given by our naive intuition.
When we're trying to use logic to decide whether to accept an ethical conclusion counter to our intuition, it's no good to document what our intuition currently says as if that settles the matte...
I get that old formalism isn't viable, but I don't see how that obviates the completeness question. "Is it possible that (e.g.) Goldbach's Conjecture has no counterexamples but cannot be proven using any intuitively satisfying set of axioms?" seems like an interesting* question, and seems to be about the completeness of mathematics-the-social-activity. I can't cash this out in the politics metaphor because there's no real political equivalent to theorem proving.
*Interesting if you don't consider it resolved by Gödel, anyway.
>If you don't assume that mathematics is a formal logic, then worrying about mathematics does not lead one to consider completeness of mathematics in the first place.
To make sure I understand this right: This is because there are definitely computationally-intractable problems (e.g. 3^^^^^3-digit multiplication), so mathematics-as-a-social-activity is obviously incomplete?
Okay, I was kinda bored while reading this, but after reading it I asked myself how much modest epistemology I used in my life. I realized I wasn't even at the level of ignoring my immodest inside-view estimates--I wasn't generating them!
I'm now in the process of seriously evaluating the success chances of the creative ideas I've had over the years, which I'm realizing I never actually did. I put real (though hobby-level) work into one once, and I've long regarded quitting my day job someday as "a serious possibility"...
Upvoted because I enjoyed reading it, and therefore personally want more stuff like it. Its shortcomings are real; in particular, the concept of "not enough money to facilitate transactions" needs to be fleshed out. I only want more like it on the assumption that this doesn't funge against other Yudkowsky posts.
I think I have a similar problem. I sometimes just fake the signal. Partly I worry that my insincerity shows, but I also suspect that guilt/shame displays are just becoming devalued in general.
My best solution is to display a (genuine) determination to do better in the future--in fact, I've basically made that my personal definition of an apology. The only trouble is that I can't do this when I don't actually feel I've acted wrongly, which is especially a problem insofar as guilt for things that aren't your fault is sometimes expec
Why speak in riddles? Because sometimes solving a puzzle teaches you more than being told the solution.
As an observation about coffee, Zizek's statement is true in its way but not especially useful. His broader point is "you should think about history and context more." So he presents you with two physically identical items, coffee without milk and coffee without cream, so that you can be surprised by noticing that there's potentially an important difference, and that surprise will make you update towards considering context and history as w
One point you neglect that would be especially relevant in the AGI scenario is leakiness of accumulated advantage. When the advantage is tech, the leaks take the pretty concrete form of copying the tech. But there's also a sense that in a globalized world, undeveloped nations will often grow faster, catching up to the more prosperous nations.
Leakiness probably explains why Britain was never strong enough to conquer Europe despite having the Industrial Revolution first.
I don't consider the second point a disagreement, since we're both sort of ambivalent. I'm pretty sure there are people who would think I'm unambiguously wrong not to be signed up, and they're who I was looking for.
On the first point--this actually seems substantial, maybe worth pursuing. I think initial-distribution measures carry a substantial risk of backfiring and making the poor poorer, while redistribution does not--seems hard to expect the same results if this is the case. This isn't necessarily a crux for me, but I
I agree that on LW 1.0, this would belong under discussion rather than main. But as far as I can tell, LW 2.0 non-frontpage posts have much less visibility than old discussion posts, to the point that this type of thread would not be viable.
Perhaps our double crux is "Non-frontpage LW 2.0 posts are a viable platform for open-type threads"? Or maybe it's "It's better to be unable to have open-type threads than to crowd the front page with them"?
So I was actually considering in-thread discussion to be a valid option--'one-on-one' meaning, in that case, that only two people would participate in a given subthread. If you think that's too optimistic, I might reconsider it. But I will definitely try to make the top point clearer, maybe
Discussions are to be one-on-one. Do not jump into anyone else's thread.
I find this easier to parse from a non-neutral perspective: If all bad comments are (currently) overtly bad, you might think we could ban overt bad comments and win at moderation. But in fact, once the ban is in effect, the bad commenters might switch to covert bad comments instead.
The ban isn't necessarily wrong, but this effect has to be considered in the cost-benefit analysis.
These differences are so profound and far-reaching—and so especially relevant for people with “our sort” of minds—that I hesitate to even begin enumerating them (though I’ll attempt to, upon request; but they should be obvious, I think!
I request this enumeration, if your offer extends to interlopers and not just Duncan.
(The differences I can think of are instant vs asynchronous communication, nonverbal+verbal vs. verbal only, and speaking only to one another vs. having an audience. But I don't see why these are *inevitably* so profound and far-reachin
This makes me want to try it :)
Would anyone else be interested in a (probably recurring if successful) "Productive disagreement practice thread"? Having a wider audience than one meetup's attendance should make it easier to find good disagreements, while being within LW would hopefully secure good faith.
I imagine a format where participants make top-level comments listing beliefs they think likely to generate productive disagreement, then others can pick a belief to debate one-on-one.
I haven't played this, but I've watched a video of Japanese comedians playing it, which actually does give a sense of how it works.
There's a (IMO very obvious) algorithm for winning this with literally zero communication: play card N after N seconds have elapsed. I don't know how easy it is to precisely count double-digit-second intervals, but it doesn't seem that interesting to find out. It seems pretty clear that steelmanning the rules means not counting seconds.
So what you end up with is a game of reading precise system-2 inform...
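For concreteness, here's a minimal sketch of the zero-communication counting strategy mentioned above: each player plays card N once N seconds have elapsed on their own internal clock. The function names, the `clock_rate` parameter, and the sample hands are all hypothetical, made up for illustration; the only thing taken from the comment is the "play card N after N seconds" rule.

```python
def play_order(hands, clock_rate=None):
    """Return the cards in the order they hit the table.

    hands: dict mapping a (hypothetical) player name to the cards they hold.
    clock_rate: optional dict of per-player clock speeds; 1.0 means the
        player counts seconds perfectly, 0.9 means they count 10% fast.
    """
    clock_rate = clock_rate or {}
    events = []
    for player, cards in hands.items():
        rate = clock_rate.get(player, 1.0)
        for card in cards:
            # The player intends to play card N at N seconds; their actual
            # play time is scaled by how fast their internal clock runs.
            events.append((card * rate, card, player))
    events.sort()  # earliest play first
    return [card for _, card, _ in events]


def is_win(order):
    """The round is won if the cards came out in ascending order."""
    return order == sorted(order)


hands = {"a": [3, 40], "b": [12, 41]}
print(play_order(hands), is_win(play_order(hands)))
print(play_order(hands, {"b": 0.9}), is_win(play_order(hands, {"b": 0.9})))
```

With perfect counting the strategy always wins, since play time equals card value. The second call shows the weak point: if one player counts even slightly fast, close cards like 40 and 41 can come out in the wrong order.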