All of Vaughn Papenhausen's Comments + Replies

Okay, I see better now where you're coming from and how you're thinking that social science could be hopeless and yet we can still build a cooperation machine. I still suspect you'll need some innovations in social science to implement such a machine. Even if we assume that we have a black box machine that does what you say, you still have to be sure that people will use the machine, so you'll need enough understanding of social science to either predict that they will, or somehow get them to.

But even if you solve the problem of implementation, I suspect y... (read more)

I like and agree with a lot in this essay. But I have to admit I'm confused by your conclusion. You dismiss social science research as probably not going anywhere, but then your positive proposal is basically more social science research. Doesn't building "a Cooperation Machine that takes in atomized people and raw intelligence and produces mutual understanding and harmonious collective action" require being "better at organizing our social life in accordance with human flourishing than the Victorians, the Qing, or the pre-conquest Lakota" in exactly the way you claim social science is trying and failing to produce?

Ivan Vendrov
You're right the conclusion is quite underspecified - how exactly do we build such a cooperation machine? I don't know yet, but my bet is more on engineering, product design, and infrastructure than on social science. More like building a better Reddit or Uber (or supporting infrastructure layers like WWW and the Internet) than like writing papers.
Mo Putera
That's not the sense I get from skimming his second most recent post, but I don't understand what he's getting at well enough to speak in his place.

Yup. This is how I learned German: found some music I liked and learned to sing it. I haven't learned much Japanese, but there's a bunch of songs I can sing (and know the basic meaning of) even though I couldn't have a basic conversation or use any of those words in other contexts.

Shoshannah Tekofsky
I was low-key imagining you speaking German like Rammstein and then Japanese like Babymetal. My inner comedian notwithstanding, that sounds awesome!

To my knowledge I am not dyslexic. If I correctly understand what subvocalizing is (reading via your inner monologue), I do it by default unless I explicitly turn it off. I don't remember how I learned to turn it off, but I remember it was a specific skill I had to learn. And I usually don't turn it off, because reading without subvocalizing 1. takes effort, 2. is less enjoyable, and 3. makes it harder for me to understand and retain what I'm reading. I generally only turn it off when I have a specific reason to read quickly, e.g. for a school assignment or reading group that I've run low on time for.

EDIT: replied to wrong comment. Curse you mobile interface!

I suspect this is getting downvoted because it is so short and underdeveloped. I think the fundamental point here is worth making though. I've used the existence proof argument in the past, and I think there is something to it, but I think the point being made here is basically right. It might be worth writing another post about this that goes into a bit more detail.

This is pretty similar in concept to the conlang toki pona, a language explicitly designed to be as simple as possible. It has fewer than 150 words. ("toki pona" means something like "good language" or "good speech" in toki pona.)

Quoting a recent conversation between Aryeh Englander and Eliezer Yudkowsky

Out of curiosity, is this conversation publicly posted anywhere? I didn't see a link.

Aryeh Englander
The conversation took place in the comments section to something I posted on Facebook: https://m.facebook.com/story.php?story_fbid=pfbid0qE1PYd3ijhUXVFc9omdjnfEKBX4VNqj528eDULzoYSj34keUbUk624UwbeM4nMyNl&id=100010608396052&mibextid=Nif5oz

Putting RamblinDash's point another way: when Eliezer says "unlimited retries", he's not talking about a Groundhog Day style reset. He's just talking about the mundane thing where, when you're trying to fix a car engine or something, you try one fix, and if it doesn't start, you try another fix, and if it still doesn't start, you try another fix, and so on. So the scenario Eliezer is imagining is this: we have 50 years. Year 1, we build an AI, and it kills 1 million people. We shut it off. Year 2, we fix the AI. We turn it back on, it kills another million... (read more)

johnlawrenceaspden
I like your phrasing better, but I think it just hides some magic. In this situation I think we get an AI that repeatedly kills 999,999 people. It's just the nearest unblocked path problem. The exact reset/restart/turn it off and try again condition matters, and nothing works unless the reset condition is 'that isn't going to do something we approve of'.

The only sense I can make of the idea is 'If we already had a friendly AI to protect us while we played, we could work out how to build a friendly AI'. I don't think we could iterate to a good outcome, even if we had magic powers of iteration. Your version makes it strictly harder than the 'Groundhog Day with Memories Intact' version. And I don't think we could solve that version.

Am I the only one who, upon reading the title, pictured 5 people sitting behind OP all at the same time?

Andrew Vlahos
Not me. However, I thought of that part in Dr. Seuss where someone watches a bee to make it more productive, someone watches that watcher to make him more productive, someone watches him and so on.
Yoav Ravid
I did too :)
gwern

Knowing how supervision scales sounds important to me. Can we get some scaling laws going here for productivity? I need to know the dollar-optimal scaling of worker/supervisor/proximity; it may be more Chinchilla-optimal to hire 20 remote workers to occasionally screenshare instead of 1 in-person person.

Perhaps
There's also The Work Gym and Pentathlon from Ultraworking.

My model of gears to ascension, based on their first 2 posts, is that they're not complaining about the length for their own sake, but rather for the sake of people that they link this post to who then bounce off because it looks too long. A basics post shouldn't have the property that someone with zero context is likely to bounce off it, and I think gears to ascension is saying that the nominal length (reflected in the "43 minutes") is likely to have the effect of making people who get linked to this post bounce off it, even though the length for practical purposes is much shorter.

Duncan Sabien (Deactivated)
Yes, agreed. I think that people who are actually going to link this to someone with zero context are going to say "just look at the bulleted list" and that's going to 100% solve the problem for 90% of the people.

I think that the set of people who bounce for the reason of "deterred by the stated length and didn't read the first paragraph to catch the context" but who would otherwise have gotten value out of my writing is very very very very very small, and wrong to optimize for.

I separately think that the world in general and LW in particular already bend farther over backwards than is optimal to reach out to what I think of in my brain as "the tl;dr crowd." I'm default skeptical of "but you could reach these people better if you X;" I already kinda don't want to reach them and am not making plans which depend upon them.

Pinker has a book about writing called The Sense of Style.

There seems to be a conflict between putting “self-displays on social media” in the ritual box, and putting “all social signalling” outside it. Surely the former is a subset of the latter.

My understanding was that the point was this: not all social signalling is ritual. Some of it is, some of it isn't. The point was: someone might think OP is claiming that all social signalling is ritual, and OP wanted to dispel that impression. This is consistent with some social signalling counting as ritual.

I think the idea is to be able to transform this:

- item 1
    - item 2
- item 3

into this:

- item 3
- item 1
    - item 2

I.e. it would treat bulleted lists like trees, and allow you to move entire sub-branches of trees around as single units.
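
For concreteness, here's a minimal Python sketch of what that tree treatment could look like (hypothetical code of mine, not any particular editor's implementation; parse_outline and render are made-up names): parse the indentation into nodes, reorder the top-level nodes, and a sub-branch rides along with its parent.

def parse_outline(text):
    """Parse indented '- item' lines into (item, children) nodes."""
    root = []             # top-level nodes
    stack = [(-1, root)]  # (indent level, children list to append into)
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        node = (line.strip(), [])
        while indent <= stack[-1][0]:  # pop back to the nearest shallower level
            stack.pop()
        stack[-1][1].append(node)
        stack.append((indent, node[1]))
    return root

def render(nodes, indent=0):
    lines = []
    for text, children in nodes:
        lines.append(" " * indent + text)
        lines.extend(render(children, indent + 4).splitlines())
    return "\n".join(lines)

outline = "- item 1\n    - item 2\n- item 3"
tree = parse_outline(outline)
tree.insert(0, tree.pop())  # move "item 3" up; "item 2" rides along with "item 1"
print(render(tree))
# - item 3
# - item 1
#     - item 2

The key design point is that once the list is a tree, "move item 1" is a single operation on one node, and its children come along for free.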

Johannes C. Mayer
Yes, exactly.

This isn't necessarily a criticism, but "exploration & recombination" and "tetrising" seem in tension with each other. E&R is all about allowing yourself to explore broadly, not limiting yourself to spending your time only on the narrow thing you're "trying to work on." Tetrising, on the other hand, is precisely about spending your time only on that narrow thing.

As I said, this isn't a criticism; this post is about a grab bag of techniques that might work at different times for different people, not a single unified strategy, but it's still interesting to point out the tension here.

Adam Zerner
Sure thing!

I think the point was that it's a cause you don't have to be a longtermist in order to care about. Saying it's a "longtermist cause" can be interpreted either as saying that there are strong reasons for caring about it if you're a longtermist, or that there are not strong reasons for caring about it if you're not a longtermist. OP is disagreeing with the second of these (i.e. OP thinks there are strong reasons for caring about AI risk completely apart from longtermism).

TekhneMakre
The whole point of EA is to be effective by analyzing the likely effects of actions. It's in the name. OP writes:

I don't think one shouldn't follow one's virtue ethics, but I note that deontology / virtue ethics, on a consequentialist view, are good for when you don't have clear models of things and ability to compare possible actions. E.g. you're supposed to not murder people because you should know perfectly well that people who conclude they should murder people are mistaken empirically; so you should know that you don't actually have a clear analysis of things.

So as I said, there's lots of reasons, such as virtue ethics, to want to work on AI risk. But the OP explicitly mentioned "longtermist cause" in the context of introducing AI risk as an EA cause; in terms of the consequentialist reasoning, longtermism is highly relevant! If you cared about your friends and family in addition to yourself, but didn't care about your hypothetical future great-grandchildren and didn't believe that your friends and family have a major stake in the long future, then it still wouldn't be appealing to work on, right?

If by "virtue ethics" the OP means "because I also care about other people", to me that seems like a consequentialist thing, and it might be useful for the OP to know that their behavior is actually consequentialist!

Not a programmer, but I think one other reason for this is that in certain languages (I think interpreted languages, e.g. Python, are the relevant category here), a name has to be defined before the code that uses it actually runs; the interpreter executes the file top-down instead of compiling it first, so a name must be bound by the time it's looked up. So

def brushTeeth():
    putToothpasteOnToothbrush()
    ...

def putToothpasteOnToothbrush():
    ...


is actually fine in Python, because putToothpasteOnToothbrush isn't looked up until brushTeeth() runs; what wouldn't work is a top-level call to brushTeeth() placed above these definitions, because then you'd be calling it before it's defined.
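
A small runnable sketch of that ordering rule (this is standard Python behavior; the function names are just the hypothetical ones from above):

# Python resolves names when a call runs, not when a 'def' line is read.

def brushTeeth():
    putToothpasteOnToothbrush()  # fine: not looked up until brushTeeth() runs

def putToothpasteOnToothbrush():
    print("toothpaste applied")

brushTeeth()  # works: both names are bound by the time this line executes

# By contrast, moving the call above the definitions fails:
#
#     brushTeeth()  # NameError: name 'brushTeeth' is not defined
#     def brushTeeth(): ...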

Adam Zerner
Yeah sometimes that is a problem for sure. In various situations it's not a problem though. There's a thing called hoisting where if you have:

sayHi("Adam");

function sayHi(name) {
  console.log(`Hi ${name}`);
}

sayHi will be moved to the top of the file above sayHi("Adam") when the code is executed, even though it is written below it.

The other way I know of that this dilemma is solved is closures. Imagine that we had a makeFreshPasta.js file with this:

const makeFreshPasta = () => {
  makeDough();
  cutDoughIntoPastaShapes();
  boilPastaShapes();
};

const makeDough = () => { ... };

const cutDoughIntoPastaShapes = () => { ... };

const boilPastaShapes = () => { ... };

export default makeFreshPasta;

The goal of the file is to export a makeFreshPasta function. It's ok that the makeFreshPasta function uses functions like makeDough that are defined below it because of how closures work. Basically, the function body of makeFreshPasta will always have access to anything that was in scope at the time makeFreshPasta was defined.

I'm not sure about other languages, but I'm sure there are other solutions available for writing code that follows the top-down style. So that is my response to the practical question of how to write code that is top-down. But I am also making a point about how I think things should be. So even if it weren't possible to write this sort of top-down code, my prescriptive point of "this is how it should be possible to write the code" still stands.

Fyi, the link to your site is broken for those viewing on greaterwrong.com; it's interpreting "--a" as part of the link.

[This comment is no longer endorsed by its author]

Maybe have a special "announcements" section on the frontpage?

The way I like to think about this is that the set of all possible thoughts is like a space that can be carved up into little territories and each of those territories marked with a word to give it a name.

Probably better to say something like "set of all possible concepts." Words denote concepts, complete sentences denote thoughts.

I'm curious if you're explicitly influenced by Quine for the final section, or if the resemblance is just coincidental.

Also, about that final section, you say that "words are grounded in our direct experience of what happens w... (read more)

Gordon Seidoh Worley
Thanks for the suggestions! I wasn't specifically thinking of Quine here, but there's probably some influence. My influences are actually more the likes of Heidegger, but philosophy seems to converge when it's on the right tack.

Master: Now, is Foucault's work the content you're looking for, or merely a pointer?

Student: What… does that mean?

Master: Do you think that the value of Foucault for you comes from the specific ideas he had, or from using him to even consider these two topics?

This put words to a feeling I've had a lot. Often I have some ideas, and use thinkers as a kind of handle to point to the ideas in my head (especially when I haven't actually read the thinkers yet). The problem is that this fools me into thinking that the ideas are developed, eit... (read more)

adamShimi
Glad that I managed to capture this feeling then! And thanks for the reference! I know of conceptual engineering and genealogy, but didn't know about the book. :)

Yep, check out the Republic; I believe this is in book 5, or if it's not in book 5, it's in book 6.

Answer by Vaughn Papenhausen

The received wisdom in this community is that modifying one's utility function is at least usually irrational. The classic source here is Steve Omohundro's 2008 paper, "The Basic AI Drives," and Nick Bostrom gives basically the same argument in Superintelligence, pp. 132-34. The argument is basically this: imagine you have an AI that is solely maximizing the number of paperclips that exist. Obviously, if it abandons that goal, there will be fewer paperclips than if it maintains that goal. And if it adds another goal, say maximizing staples, then this other ... (read more)

I would think the metatheological fact you want to be realist about is something like "there is a fact of the matter about whether the God of Christianity exists." "The God of Christianity doesn't exist" strikes me as an object-level theological fact.

The metaethical nihilist usually makes the cut at claims that entail the existence of normative properties. That is, "pleasure is not good" is not a normative fact, as long as it isn't read to entail that pleasure is bad. "Pleasure is not good" does not by itself entail the existence of any normative property.

Really? I'm American and it sounds perfectly normal to me.

bfinn
Is it a common phrase in its own right, as it is here in the UK? Maybe it's regional; my partner, from Chicago, didn't recognise it, though she got the literal gist.

(ADDED) Actually I see some dictionaries list it (also e.g. 'well-earned rest') as a US phrase as well as UK:
https://www.macmillandictionary.com/dictionary/american/well-earned
https://www.merriam-webster.com/dictionary/well-earned

Though from googling places it's used, I get the impression it's mostly British. I think it has a nice, cozy emotion to it - like awarding yourself a prize each time you take a break (even after 3 minutes' work!) I find it hard to say 'well-earned break' without smiling!

I think this post is extremely interesting, and on a very important topic. As I said elsethread, for this reason, I don't think it should be in negative karma territory (and have strong-upvoted to try to counterbalance that).

On the object level, while there's a frame of mind I can get into where I can see how this looks plausible to someone, I'm inclined to think that this post is more of a reductio of some set of unstated assumptions that lead to its conclusion, rather than a compelling argument for that conclusion. I don't have the time right now to thin... (read more)

I agree with this as well. I have strongly upvoted in an attempt to counterbalance this, but even so it is still in negative karma territory, which I don't think it deserves.

Well if we've fallen to the level of influencing other people's votes by directly stating what the votes ought to say (ugh =/), then let me argue the opposite: This post – at least in its current state – should not have a positive rating.

I agree that the topic is interesting and important, but – as written – this could well be an example of what an AI with a twisted/incomplete understanding of suffering, entropy, and a bunch of other things has come up with. The text conjures several hells, both explicitly (Billions of years of suffering are the right choi... (read more)

A possible example of research film-study in a very literal sense: Andy Matuschak's 2020-05-04 Note-writing livestream.

I would love it if more people did this sort of thing.

I think if you accept the premise that the machine somehow magically truly simulates perfectly and indistinguishably from actual reality, in such a way that there is absolutely no way of knowing the difference between the simulation and the outside universe, then the simulated universe is essentially isomorphic to reality, and we should be fully indifferent. I’m not sure it even makes sense to say either universe is more “real”, since they’re literally identical in every way that matters (for the differences we can’t observe even in theory, I appeal to Ne

... (read more)
leogao
I guess in that case I think what I'm doing is identifying the experience machine objection as being implied by Newton's flaming laser sword, which I have far stronger convictions on. For those who reject NFLS, I guess my argument doesn't really apply. However, at least I personally was in the category of people who firmly accept NFLS and also had reservations about the experience machine, so I don't think this implication is trivial.

As for the Andy and Bob situation, I think that objections like that can be similarly dissolved, given an acceptance of NFLS. If Bob has literally absolutely no way of finding out whether his wife and children truly love him, if they act exactly in the way they would if they really did, then I would argue that whether or not they "really" love him is equally irrelevant by NFLS. Our intuitions in this case are guided by the fact that in reality, Potemkin villages almost always eventually fall apart.
samshap
I think the Bob example is very informative! I think there's an intuitive and logical reason why we think Bob and Edward are worse off. Their happiness is contingent on the masquerade continuing, which has a probability less than one in any plausible setup. (The only exception to this would be if we're analyzing their lives after they are dead.)
TAG
Upvoted from 0. Why was it downvoted?
Answer by Vaughn Papenhausen

Not sure if this is exactly what you're looking for, but you could check out "Do Now" on the Play Store: https://play.google.com/store/apps/details?id=com.slamtastic.donow.app (no idea whether it's available for Apple or not)

Answer by Vaughn Papenhausen

Two things I've come across. Haven't used either much, but figured I'd mention them:

Ah, I think the fact that there's an image after the first point is causing the numbered list to be numbered 1,1,2,3.

My main concern with using an app like Evergreen Notes is that a hobby project built by one person seems like a fragile place to leave a part of my brain.

In that case you might like obsidian.md.

I found this one particularly impressive: https://m.youtube.com/watch?v=AHiu-EDJUx0

The use of "oops" at the end is spot on.

ChristianKl
Yes, that video seems very unlikely to happen without the dog having some idea of what the buttons it presses mean.

Hmm. I think this is closer to "general optimizer" than to "optimizer": notice that certain chess-playing algorithms (namely, those that have been "hard-coded" with lots of chess-specific heuristics and maybe an opening manual) wouldn't meet this definition, since it's not easy to change them to play e.g. checkers or backgammon or Go. Was this intentional (do you think that this style of chess program doesn't count as an optimizer)? I think your definition is getting at something interesting, but I think it's more specific than "optimizer".

Chantiel
Sorry for the late response. If a chess program still has a planning or search algorithm, then I think it would still be helpful for describing an optimizer for something else. For example, suppose a chess program uses a standard planning algorithm and has added chess-specific heuristics, a chess world model, and goals. Then if you wanted to specify a something-else-optimizer, you could change most of the things but keep the planning algorithm.

To count as an optimizer, an optimizer for one thing doesn't need to be easily turned into an optimizer for something else. But it needs to help. It's possible that there is a way to construct what should be called an optimization algorithm that has no generalizability at all, but I'm not sure how to do that.
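
To make that concrete, here's a minimal Python sketch (illustrative names of mine, not from the thread): the search procedure is domain-agnostic, and only the world model (moves/result) and goal (evaluate) would need replacing to retarget the optimizer.

def plan(state, moves, result, evaluate, depth):
    """Generic depth-limited search; knows nothing about chess."""
    options = moves(state)
    if depth == 0 or not options:
        return evaluate(state), None
    best_value, best_move = float("-inf"), None
    for move in options:
        value, _ = plan(result(state, move), moves, result, evaluate, depth - 1)
        if value > best_value:
            best_value, best_move = value, move
    return best_value, best_move

# Toy domain: grow a number over three steps. Only these four arguments
# are domain-specific; plan() itself would be reused unchanged for chess,
# checkers, or anything else with a moves/result/evaluate interface.
value, move = plan(
    3,
    moves=lambda s: ["+1", "*2"],
    result=lambda s, m: s + 1 if m == "+1" else s * 2,
    evaluate=lambda s: s,
    depth=3,
)
print(value, move)  # 24 *2

A program built around a reusable core like plan() "helps describe" an optimizer for something else, whereas one whose search is entangled with chess heuristics does not.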

I really liked this. I thought the little graphics were a nice touch. And the idea is one of those ones that seems almost obvious in retrospect, but wasn't obvious at all before reading the post. Looking back I can see hints of it in thoughts I've had before, but that's not the same as having had the idea. And the handle ("point of easy progress") is memorable, and probably makes the concept more actionable (it's much easier to plan a project if you can have thoughts like "can I structure this in such a way that there is a point of easy progress, and that I will hit it within a short enough amount of time that it's motivating?").

I've started using the phrase "existential catastrophe" in my thinking about this; "x-catastrophe" doesn't really have much of a ring to it though, so maybe we need something else that abbreviates better?

So one thing I'm worried about is having a hard time navigating once we're a few episodes in. Perhaps you could link in the main post to the comment for each episode?

Vaniver
Great idea, will do.

Could this be solved just by posting your work and then immediately sharing the link with people you specifically want feedback from? That way there's no expectation that they would have already seen it. (Granted, this is slightly different from a gdoc in that you can share a gdoc with one person, get their feedback, then share with another person, while what I suggested requires asking everyone you want feedback from all at once.)

adamShimi
Thanks for the idea! I agree that it probably helps, and it solves my issue with the state of knowledge of the other. That being said, I don't feel like this solves my main problem: it still feels to me like pushing too hard. Here the reason is that I post on a small venue (rarely more than a few posts per day) that I know the people I'm asking feedback from read regularly. So if I send them such a message at the moment I publish, it feels a bit like I'm saying that they wouldn't read and comment on it without that, which is a bit of a problem.

(I'm interested to know if researchers on the AF agree with that feeling, or if it's just a weird thing that only exists in my head. When I try to think about being at the other end of such a message, I see myself as annoyed, at the very least.)

I disagree, I think Kithpendragon did successfully refute the argument without providing examples. Their argument is quite simple, as I understand it: words can cause thoughts, thoughts can cause urges to perform actions which are harmful to oneself, such urges can cause actions which are harmful to oneself. There's no claim that any of these things is particularly likely, just that they're possible, and if they're all possible, then it's possible for words to cause harm (again, perhaps not at all likely, for all Kithpendragon has said, but possible). It b... (read more)

philh
Without having an object level opinion here (I didn't read the post and I only skimmed the comments), I note that this argument is incomplete. It may be that the set of "urges that cause harmful actions" is disjoint from the set of "urges which can be caused by thoughts which can be caused by words".
Said Achmiz
It’s worse than a technicality—it’s an equivocation between meanings of “cause”. In ordinary speech we do not speak of one thing “causing” another if, say, the purported “cause” is merely one of several possible (collectively) necessary-but-not-sufficient conditions for the purported “effect” to occur—even though, in a certain sense, such a relationship is “causal”—because if we did, then we would have to reply to “what caused that car accident” with “the laws of physics, plus the initial conditions of the universe”. So kithpendragon proves that words can “cause harm” in the latter technical sense, but the force of the argument comes from the claim that words can “cause harm” in the former colloquial sense—and that has assuredly not been proven. (And this is without even getting into the part about “large cascades of massive change” and “downward spiral of self destruction with truly unfortunate consequences” and such things, that can allegedly be caused by “what seems like a tiny thing”—a claim that is presented without any support at all, but without which the injunction that motivates the post simply fails to follow from any of the rest of it!)

A sneeze can determine much more than hurricane/no hurricane. It can determine the identities of everyone who exists, say, a few hundred years into the future and onwards.

If you're not already familiar, this argument gets made all the time in debates about "consequentialist cluelessness". This gets discussed, among other places, in this interview with Hilary Greaves: https://80000hours.org/podcast/episodes/hilary-greaves-global-priorities-institute/. It's also related to the paralysis argument I mentioned in my other comment.

Upvoted for giving "defused examples" so to speak (examples that are described rather than directly used). I think this is a good strategy for avoiding the infohazard.

kithpendragon
Agreed. Thanks, Viliam, for pointing at conditions instead of giving direct examples.

I was thinking a bit more about why Christian might have posted his comment, and why the post (cards on the table) got my hackles up the way it did, and I think it might have to do with the lengths you go to to avoid using any examples. Even though you aren't trying to argue for the thesis that we should be more careful, because of the way the post was written, you seem to believe that we should be much more careful about this sort of thing than we usually are. (Perhaps you don't think this; perhaps you think that the level of caution you went to in this p... (read more)

kithpendragon
That's a really helpful (and, I think, quite correct) observation. I'm not usually quite so careful as all that. This seemed like something it would be really easy to get wrong.

Sorry for the long edit to my comment, I was editing while you posted your comment. Anyway, if your goal wasn't to go all the way to "people need to be more careful with their words" in this post, then fair enough.

Vaughn Papenhausen
I was thinking a bit more about why Christian might have posted his comment, and why the post (cards on the table) got my hackles up the way it did, and I think it might have to do with the lengths you go to to avoid using any examples. Even though you aren't trying to argue for the thesis that we should be more careful, because of the way the post was written, you seem to believe that we should be much more careful about this sort of thing than we usually are. (Perhaps you don't think this; perhaps you think that the level of caution you went to in this post is normal, given that giving examples would be basically optimizing for producing a list of "words that cause harm." But I think it's easy to interpret this strategy as implicitly claiming that people should be much more careful than they are, and miss the fact that you aren't explicitly trying to give a full defense of that thesis in this post.)

I originally had a longer comment, but I'm afraid of getting embroiled in this, so here's a short-ish comment instead. Also, I recognize that there's more interpretive labor I could do here, but I figure it's better to say something non-optimal than to say nothing.

I'm guessing you don't mean "harm should be avoided whenever possible" literally. Here's why: if we take it literally, then it seems to imply that you should never say anything, since anything you say has some possibility of leading to a causal chain that produces harm. And I'm guessing you don't... (read more)

Pattern
One way of dealing with this is stuff like talking to people in person: with a small group of people the harm seems bounded, which allows for more iteration, as well as perhaps specializing - "what will harm this group? What will not harm this group?" - in ways that might be harder with a larger group. Notably, this may require back and forth, rather than one-way communication. For example

Also, no one else seems to have used the spoilers in the comments at all. I think this is suboptimal given that moderation is not a magic process, although it seems to have turned out fine so far.
kithpendragon
Yes, I'd agree with all that. My goal was to counter the argument that words can't cause harm. I keep seeing that argument in the wild. Thanks for helping to clarify!