All of tristanm's Comments + Replies

A couple of guesses for why we might see this, which don't seem to depend on property:

  • An obligation to act is much more freedom-constraining than a prohibition on an action. The more one considers all possible actions under an obligation to take the most ethically optimal one, the less room one has for exploration, contemplation, or pursuing one's own selfish values. A prohibition on actions does not have this effect.
  • The environment we evolved in had roughly the same level of opportunity to commit harmful acts, but far less opportunity
... (read more)

It seems to construct an estimate of it by averaging a huge number of observations together before each update (for Dota 5v5, they say each batch is around a million observations, and I'm guessing it processes about a million batches). The surprising thing is that this works so well, and it allows leveraging of computational resources very easily.
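
To make the variance intuition concrete, here is a minimal sketch (all numbers invented; this is not OpenAI's code) of why averaging a huge batch of noisy per-observation estimates yields a usable gradient signal:

```python
import numpy as np

# Hypothetical illustration: each observation gives an extremely noisy
# estimate of the same underlying gradient signal. Averaging a batch of
# size N shrinks the standard error by a factor of sqrt(N).
rng = np.random.default_rng(0)
true_signal = 1.0
noise_scale = 100.0          # individual observations are almost pure noise
batch_size = 1_000_000       # the comment above cites batches of ~1M

samples = true_signal + noise_scale * rng.standard_normal(batch_size)

print(f"single-sample std: {noise_scale:.1f}")
print(f"batch-mean std:    {noise_scale / np.sqrt(batch_size):.3f}")
print(f"batch estimate:    {samples.mean():.4f}")  # lands close to 1.0
```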

My guess for how it deals with partial observability in a more philosophical sense is that it must be able to store an implicit model of the world in some way, in order to better predict the reward it will eventu... (read more)

jessicata
After thinking about this more, I have a hypothesis for how it works: It records the sequence of states resulting from play, to be used for learning when the game is done. It has some Q value estimator which, given a state and an action, returns the expected utility of taking that action in that state (of course, this expected utility depends on the policies both players are using). Using the sequence of states and the actions (including exploration actions), it makes an update to this Q value estimator.

But this Q value estimator is not directly usable in determining a policy, since it needs to know the full state. Instead, it is used to compute a gradient update for the policy. After the game, the Q value estimator can tell you the value of taking each possible action at each actual time step in the game (since all game states are known at the end of the game); this can be used to update the policy so it is more likely to take the actions that are estimated to be better after the game.

I'm not sure if this hypothesis is correct, but I don't currently see anything wrong with this algorithm. Thank you for causing me to think of this. (BTW, there is a well-defined difference between full and partial observability, see MDP vs POMDP. Convergence theorems for vanilla Q learning will require the game to be an MDP rather than a POMDP.)
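
To make the hypothesized scheme concrete, here is a runnable toy version on a one-step game; the softmax policy, learning rate, and payoffs are all invented for illustration, and this is not claimed to be OpenAI's actual algorithm:

```python
import numpy as np

# Toy version of the hypothesis above: play an episode, update a Q value
# estimator from the recorded outcome, then use the estimator (not the raw
# reward) to push a softmax policy toward better-scoring actions.
rng = np.random.default_rng(0)
n_actions = 3
true_q = np.array([1.0, 2.0, 0.5])   # hidden expected payoff per action
q_est = np.zeros(n_actions)          # critic: running Q value estimate
counts = np.zeros(n_actions)
logits = np.zeros(n_actions)         # policy parameters

for episode in range(5000):
    probs = np.exp(logits) / np.exp(logits).sum()
    action = rng.choice(n_actions, p=probs)          # includes exploration
    reward = true_q[action] + rng.standard_normal()  # noisy payoff
    # After the "game": update the Q estimator from the recorded outcome...
    counts[action] += 1
    q_est[action] += (reward - q_est[action]) / counts[action]
    # ...then take a policy-gradient step using the estimator's scores.
    advantage = q_est[action] - probs @ q_est
    logits += 0.05 * advantage * (np.eye(n_actions)[action] - probs)

probs = np.exp(logits) / np.exp(logits).sum()
print(np.round(probs, 3))  # mass concentrates on the best action (index 1)
```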

I don't know how hard it would be to do a side-by-side "FLOPS" comparison of Dota 5v5 vs AlphaGo / AlphaZero, but it seems like they are relatively similar in terms of the computational cost required to achieve something close to "human level". However, as has been noted by many, Dota is a game of vastly more complexity because of its continuous state, partial observability, large action space, and time horizon. So what does it mean when it requires roughly similar orders of magnitude of compute to achieve the same level of ability as human... (read more)

I've been meditating since I was about 19, and before I came across rationality / effective altruism. There is quite a bit of overlap between the sets of things I've been able to learn from both schools of thought, but I think there are still a lot of very useful (possibly even necessary) things that can only be learned from meditative practices right now. This is not because rationality is inherently incapable of learning the same things, but because within rationality it would take very strong and well developed theories, perhaps developed through large ... (read more)

It seems like in the vast majority of conversations, we find ourselves closer to the "exposed to the Deepak Chopra version of quantum mechanics and haven't seen the actual version yet" situation than we do to the "Arguing with someone who is far less experienced and knowledgeable than you are on this subject." In the latter case, it's easy to see why steelmanning would be counterproductive. If you're a professor trying to communicate a difficult subject to a student, and the student is having trouble understanding your position, it's u... (read more)

I don't see him as arguing against steelmanning. But the opposite of steelmanning isn't arguing against an idea directly. You've got to be able to steelman an opponent's argument well in order to argue against it well too, or perhaps determine that you agree with it. In any case, I'm not sure how to read a case for locally valid argumentation steps as being in favor of not doing this. Wouldn't it help you understand how people arrive at their conclusions?

ChristianKl
There are plenty of times where someone writes a LessWrong post and, while I do agree with the central point of the post, I disagree with a noncentral part of it. A person might use some historical example and I disagree with the example. In those cases it's an open question for me whether or not it's useful to write the comment that disagrees, or whether that's bad for LW. It might be bad because people feel like they are getting noncentral feedback and that discourages them.

I would also like to have a little jingle or ringtone play every time someone passes over my comments, please implement for Karma 3.0 thanks

What's most unappealing to me about modern, commercialized aesthetics is the degree to which the bandwidth is forced to be extremely high - something I'd call the standardization of aesthetics. When I walk down the street in the financial district of SF, there's not much variety to be found in people's visual styles. Sure, everything looks really nice, but I can't say that it doesn't get boring after a while. It's clear that a lot of information is being packed into people's outfits, so I should be able to infer a huge amount about someone just by looking ... (read more)

Said Achmiz
Which one is that?

It seems like this objection might be empirically testable, and in fact might be testable even with the capabilities we have right now. For example, Paul posits that AlphaZero is a special case of his amplification scheme. In his post on AlphaZero, he doesn't mention there being an aligned "H" as part of the set-up, but if we imagine there to be one, it seems like the "H" in the AlphaZero situation is really just a fixed, immutable calculation that determines the game state (win/loss/etc.) that can be performed with any board input... (read more)

I can't emphasize enough how important the thing you're mentioning here is, and I believe it points to the crux of the issue more directly than most other things that have been said so far. 

We can often weakman postmodernism as making basically the same claim, but this doesn't change the fact that a lot of people are running an algorithm in their head with the textual description "there is no outside reality, only things that happen in my mind." This algorithm seems to produce different behaviors in people than if they were running... (read more)

I could probably write a lot more about this somewhere else, but I'm wondering if anyone else felt that this paper seemed kind of shallow. This comment is probably too brief to really do this feeling justice, but I'll try to decompose it into two things I found disappointing:

  1. "Intelligence" is defined in such a way that leaves a lot to be desired. It doesn't really define it in a way that makes it qualitatively different than technology in general ("tasks thought to require intelligence" is probably much less us
... (read more)
Qiaochu_Yuan
My cynical take is that the point of writing papers like this is for them to be cited, not read.

This is partly a reply and partly an addendum to my first comment. I've been thinking about a sort of duality that exists within the rationalist community to a degree, and that has become a lot more visible lately, in particular with posts like this. I would refer to this duality as something like "The Two Polarities of Nonconformists", although I'm sure someone could think of something better to call it. The way I would describe it is that communities like this one are largely composed of people who feel fairly uncomfortable with the w... (read more)

moridinamael
I think this dichotomy carves reality pretty well. Nice comment.

I'm reminded of the different approaches to magic described in various stories. In some stories magic is ineffable. The characters never really understand it. They use it intuitively, and its functioning tends to depend on emotional states or degrees of belief or proper intentions. Wizardry is more like art than science. In another type of story, magic is mechanical. A mage learns precise words, movements or rituals to operate a kind of invisible machine that serves up magical results. Wizardry is not unlike being an engineer or programmer.

I think that you can view real life as having both qualities. That's probably why these two views of magic have any appeal in the first place. I find it more appealing to be the kind of mage who understands the nuts and bolts. To stretch the metaphor probably too far, it's all well and good to know a long, complex ritual that summons a demon, but I find it more aesthetically appealing to understand which elements of that ritual are load-bearing and then just do those. And maybe that means I just do the "spell" in my head in five seconds instead of performing a lengthy narrative-conforming ritual.

Maybe magic will twist the world so that one doesn't miss their connection with the Buddhist monk in NYC. (I super-duper doubt it, though. This is actually just classic hindsight bias.) I would rather rely on basic planning principles to get the same outcome. At least then the causal story is actually true. And if my planning approach fails, then I can learn from that, rather than having the Mythic approach fail and being forced to shrug and accept that this is the outcome the cosmos wanted.

I think I've been doing "mythic mode" for nearly my entire life, not because I came up with this idea explicitly and intentionally deployed it, but because it sort of happens on its own without any effort. It's not something that happens at every moment, but frequently enough that it's a significant driving force.

But it used to be that I didn't notice myself doing this at all; this was just how it felt to be a person with goals and desires, and doing mythic mode was just how that manifested in my subconscious desires. ... (read more)

Valentine
A lot of what you say here is why I think it's maybe really important to learn how to sandbox mythic mode, even if you don't want to intentionally use it. Otherwise I think something like it seeps into your system anyway.

Yep! I debated framing it this way, but I eventually decided against it because I thought it would be distracting here. And as you say, I rederived the ideas, and then later noticed that they corresponded to my read of what Jung was talking about… and not having really read Jung in any depth, I didn't want to tie my ideas to other things he might have claimed.

Mmm… not exactly. More like, I posit that it has scripts, and guides people to play them out. This often involves an element of predicting people's actions, but it's more a matter of predicting what kinds of actions someone is likely to take. "What kind of person is this?" rather than "What is this person going to do?"

I think that's close enough. I'd just add the caveat that by my model, people mostly can't intentionally stray from paths. There are exceptions, but they're relatively rare, and when done without finesse it can create some pretty ferocious responses. Like, I suspect that psychopathy is in part being unaffected by Omega's tugs, and people generally really really don't like others to be quite that free.

Yep, I agree, that's important, and the framework says that it's extremely difficult for the most part (except where it doesn't matter to the "scene", or where it's about things that aren't subject to scripts the way physics isn't). This is another way of stating what I see as a core challenge for a mature art of rationality to gracefully navigate.

Or more generally: Break up a larger and more difficult task into a chain of subtasks, and see if you have enough willpower to accomplish just the first piece. If you can, allow yourself a sense of accomplishment and use that as a willpower boost to continue, or try to break up the larger task even further.

If this works and people are able to get themselves to do more complex and willpower heavy tasks that they wouldn't normally be able to do, wouldn't that be a good thing by default? Or are you worried that it would allow people with poorly aligned incentives to do more damage?

Qiaochu_Yuan
No, I'm worried about people hurting themselves doing this, not others. In general I have a lot of concerns around people forcing themselves to do things - my model is that this amounts to some of their parts subjugating other parts, and I think this is both bad in itself and leads to visibly bad consequences down the line, like burning out.

Circling seems like one of those things where both its promoters and detractors vastly overestimate its effects, either positive or negative. Like, a lot of the responses to this are either "it's pretty cool" or "it's pretty creepy." What about "meh"? The most likely outcome is that Circling does something extremely negligible, if it does anything at all; or, if it does seem to have some benefit, it's because of the extra hour you set aside to think about things without many other distractions. In which case, a questio... (read more)

John_Maxwell
Circling was "meh" for me. Maybe people who find it "meh" aren't as motivated to talk about it, so we get selection effects.
Qiaochu_Yuan
Circles can vary extremely widely based on who's in them and how skilled the facilitators are, so it's not surprising that people have both widely varying experiences and widely varying senses of the possible range of experiences. (Again, the analogy to sex is helpful here.) I want to generally caution everyone in this discussion, both promoters and detractors, to avoid updating too strongly based only on their own circles. I can repeat from my other comment that circling has been extremely helpful for me personally and also that this is probably in large part because I've gotten to work with unusually skilled facilitators. I'm not surprised to hear that other people have very different and neutral or even much more negative experiences. A facilitator who's Goodharting on the wrong thing can be very bad, especially if no one else in the circle is experienced enough to notice and call them out on it.
PDV
Producing a strong emotional attachment to the activity, and thinking it's really great, is itself a significant negative effect.

Personally, I wonder how much of this disagreement can be attributed to prematurely settling on specific fundamental positions or some hidden metaphysics that certain organizations have (perhaps unknowingly) committed to - such as dualism or panpsychism. One of the most salient paragraphs from Scott's article said:

Morality wasn’t supposed to be like this. Most of the effective altruists I met were nonrealist utilitarians. They don’t believe in some objective moral law imposed by an outside Power. They just think that we should pursue our own human-paroc
... (read more)

I'm actually having a hard time deciding which kinds of superstimuli are having too strong a detrimental effect on my actions. The reason for this is that some superstimuli also act as willpower restorers. Take music, for example. Listening to music does not usually get mentioned as a bad habit, but it is also an extremely easy stimulus to access, requires little to no attention or effort to maintain, and, at least for me, tends to amplify the degree of mind wandering and daydreaming. On the other hand, it is a huge mood booster and increases c... (read more)

Qiaochu_Yuan
Optimizing a schedule is cognitively demanding and constantly tests your willpower; giving up for 40 days is simple, if not easy. On the other hand, if there's no part of you that feels worried about music then I wouldn't give up music in your position. You could try taking up a new habit / hobby instead.

I'm not actually seeing why this post is purely an instance of the conjunction fallacy. A lot of the details he describes are consequences of cars being autonomous, or indirect effects of this. And that's not to say there are no errors here, just that I don't think it's merely a list of statements A, B, C, etc., with no causal relationship.

If you define "rationality" as having good meta level cognitive processes for carving the future into a narrower set of possibilities in alignment with your goals, then what you've described is simply a set of relatively poor heuristics for one specific set of goals, namely, the gaining of social status and approval. One can have that particular set of goals and still be a relatively good rationalist. Of course, where do you draw the line between "pseudo" and "actual" given that we are all utilizing cognitive heuristics to some degree? I see the line being drawn as sort of arbitrary.

I think the driving motivator for seeking out high variance in groups of people to interact with is an implicit belief that my value system is malleable, and a strong form of modesty concerning my beliefs about what values I should have. Over time I realized that my value system isn't really all that malleable, and my intuitions about it are much more reliable indicators than observing a random sample of people; therefore a much better strategy for fulfilling goals set by those values is to associate with people who share them.

If the latter gets too large, then you start getting swarmed with people who want money and prestige but don't necessarily understand how to contribute, who are incentivized to degrade the signal of what's actually important.

During this decade the field of AI in general became one of the most prestigious and high-status academic fields to work in. But as far as I can tell, it hasn't slowed down the rate of progress in advancing AI capability. If anything, it has sped it up - by quite a bit. It's possible that a lot of newcomers to the f... (read more)

Raemon
This is an interesting point I hadn't considered. Still mulling it over a bit.

I'm curious whether the rationalsphere/AI risk community has ever experimented with hiring people to work on serious technical problems who aren't fully aligned with the values of the community, or not already fully invested in it. It seems like ideological alignment is a major bottleneck to locating and attracting relevant skill and productivity levels, and there might be some benefit to being open about tradeoffs that favor skill and productivity at the expense of complete commitment to solving AI risk.

Ben Pace
I wrote a post with some reasons to be skeptical of this.

(Re-writing this comment from the original to make my point a little more clear).

I think it is probably quite difficult to map the decisions of someone on a continuum from really bad to really good if you can't simulate the outcomes of many different possible actions. There's reason to suspect that the "optimal" outcome in any situation looks vastly better than the outcomes of even very good but slightly sub-optimal decisions, and vice versa for the least optimal outcome.

In this case we observed a few people who took massive risks (by devoting their ... (read more)
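
A quick simulation shows why that gap can be so large when payoffs are heavy-tailed; the lognormal distribution and all numbers here are invented purely for illustration:

```python
import numpy as np

# Under a heavy-tailed payoff distribution, the single best action dwarfs
# even 99th-percentile actions, so realized outcomes are a noisy guide to
# the quality of the underlying decision procedure.
rng = np.random.default_rng(1)
payoffs = rng.lognormal(mean=0.0, sigma=2.0, size=1_000_000)

print(f"optimal payoff:           {payoffs.max():>12,.0f}")
print(f"99.9th percentile payoff: {np.percentile(payoffs, 99.9):>12,.0f}")
print(f"99th percentile payoff:   {np.percentile(payoffs, 99):>12,.0f}")
print(f"median payoff:            {np.percentile(payoffs, 50):>12,.2f}")
```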

Mostly I just want people to stop bringing models about the other person's motives or intentions into conversations, and if tabooing words or phrases won't accomplish that, and neither will explicitly enforcing a norm, then I'm fine not going that route. It will most likely involve simply arguing that people should adopt a practice similar to what you mentioned.

Confusion in the sense of one or both parties coming to the table with incorrect models is a root cause, but this is nearly always the default situation. We ostensibly partake in a conversation in order to update our models to more accurate ones and reduce confusion. So while yes, a lack of confusion would make bad conversations less likely, it also just reduces the need for the conversation to begin with.

And here we’re talking about a specific type of conversation that we’ve claimed is a bad thing and should be prevented. Here we need to identify a diffe... (read more)

zulupineapple
We can say that all disagreements start with confusion. Then I claim that if the confusion is quickly resolved, or if one of the parties exits the conversation, then the thread is normal and healthy. And that in all other cases the thread is demonic.

Not all confusion is created equal. I'm claiming that the depth of this initial confusion is the best predictor of demon threads. I can understand why status games would prevent someone from exiting, but people ignoring their deep confusions is not a good outcome, so we don't really want them to exit, we want them to resolve it. I don't really see how status games could deepen the confusion.

I'd call that "confusion about what the other party thinks", and put it under the umbrella of general confusion. In fact that's the first kind of confusion I think about, when I think of demonic threads, but object-level confusion is important too. Maybe we aren't disagreeing?

I'm not really faulting all status games in general, only tactics which force them to become zero-sum. It's basically unreasonable to ask that humans change their value systems so that status doesn't play any role, but what we can do is alter the rules slightly so that outcomes we don't like become improbable. If I'm accused of being uncharitable, I have no choice but to defend myself, because being seen as "an uncharitable person" is not something I want to be included in anyone's models of me (even in the case wher... (read more)

zulupineapple
It is bad to discuss abstract things.

Do you agree that Kensho is an example of a demon thread? Is it a first type or second type? How about the subthread that starts here? I claim that it's all "second type". I claim that "first type", status-game based demon threads without deep confusion, if they exist at all, aren't even a problem to anyone. I claim that if, in a thread, there are both status games and deep confusion, the games are caused by the frustration resulting from the confusion, not the other way around. Confusion is the real root problem.

Are they "usual", mundane problems? Do we know of any good solutions? Do we at least have past discussions about them? Why not? This is not obvious to me.

If the goal is for conversations to be making epistemic progress, with the caveat that individual people have additional goals as well (such as obtaining or maintaining high status within their peer group), and Demon Threads “aren’t important” in the sense that they help neither of these goals, then it seems the solution would simply be better tricks participants in a discussion can use in order to notice when these are happening or likely to happen. But I think it’s pretty hard to actually measure how much status is up for grabs in a given conversation. ... (read more)

Raemon
Others have approached this from slightly different angles, but I'd say "you're being uncharitable" is a symptom rather than a cause. If the conversation gets to the point where someone doesn't trust their conversation partner, something has already gone wrong.
Qiaochu_Yuan
Hoo, boy. I think tabooing language that looks explicitly status-y is a bad idea, and won't even get you what you want - anyone who really wants to do status stuff will just find more obfuscated language for doing it (including me). I would probably like it if people went more in the NVC / Circling direction, away from claims about someone else and towards claims about themselves, e.g. "I feel frustrated" as opposed to "you're being uncharitable," but the way you get people to do this is not by tabooing or even by recommending tabooing.
zulupineapple
I propose that some people may say it because it is true, and because they have a naive hope that the other party would try to be more charitable if they said it.

All disagreements are zero sum, in the sense that one party is right and the other is wrong. A disagreement is only positive sum when your initial priors are so low that the other side only needs a few comments of text to provide sufficient information to change your mind, in other words, when you don't know what you're talking about. On the other hand, if you've already spent an hour of your life thinking about the topic, then you've probably already considered and dismissed the kinds of arguments the other side will bring up (and that's assuming that you managed to explain what your view is well enough, so that their arguments are relevant to begin with).

Frankly, I'm bothered by how much you blame status games, while completely ignoring the serious challenges of identifying and resolving confusion.
gwillen
I think statements about models of a conversation partner's intent can be good or bad. They are bad if they're being used as accusations. They're potentially good if they're used in the context of a request for understanding (e.g. "I feel like your tone in this post is hostile -- was that your intention?") I don't see the latter much outside of the LW-sphere, but when I do see it, I think it has value.

I'm trying to decide whether or not I understand what "looking" is, and I think it's possible I do, so I want to try and describe it, and hopefully get corrected if it turns out I'm very wrong. 

Basically, there's sort of a divide between "feeling" and "Feeling" and it's really not obvious that there should be, since we often make category errors in referring to these things. On the one hand, you might have the subjective feeling of pain, like putting your hand on something extremely hot. Part of that f... (read more)

moridinamael
My understanding is that the skill you describe here is a prerequisite for what Valentine describes as Kensho or stream-entry. Stream-entry/Kensho refer to a broader kind of "getting it", a brief grasping of the illusoriness of the self and comprehension of the oneness of all things, etc. I would add that there appear to be numerous different schools of thought on what these terms actually refer to. Some will say that you haven't achieved the target state unless you've experienced a jarring "cessation event", in which you witness your conscious mind blink out and then come back online, and this event prompts a certain set of realizations about the nature of the mind. Some other schools don't seem to regard the cessation event as necessary. I too would like to get a clear answer on this, because terms like "kensho" tend to have more than one possible interpretation.
ChristianKl
Given what Val did with his shoulder after the operation, it would surprise me a lot if he hadn't been able to make that distinction pre-kenshō for a sensation like pain. I have the impression that Val points to things that are more advanced than that.
Said Achmiz
I would very much like to know whether this is, in fact, related to what Valentine is talking about. (I have much to say in response to this comment, but don’t want to start a long thread about it if it would be off-topic.)

We need to understand how the black box works inside, to make sure our version's behavior is not just similar but based on the right reasons.

I think "black-box" here can be used to refer to two different things: things in philosophy or science which we do not fully understand yet, and machine learning models like neural networks that seem to capture their knowledge in ways that are uninterpretable to humans.

We will almost certainly require the use of machine learning or AI to model systems that are beyond our capabi... (read more)

cousin_it
I think what you're describing is possible, but very hard. Any progress in that direction would be much appreciated, of course.
LawrenceC
I agree! There's a distinction between "we know exactly what knowledge is represented in this complicated black box" and "we have formal guarantees about properties of the black box". It's indeed very different to say "the AI will have a black box representing a model of human preferences" and "we will train the AI to build a model of human preferences using a bootstrapping scheme such as HCH, which we believe works because of these strong arguments".

Perhaps more crisply, we should distinguish between black boxes where we have a good grasp of why the box will behave as expected, and black boxes whose behavior we have little ability to reason about at all. I believe that both cousin_it and Eliezer (in the Artificial Mysterious Intelligence post) are referring to the folly of using the second type of black box in AI designs.

Perhaps related: Jessica Taylor's discussion on top-level vs subsystem reasoning.

A sort of fun game that I’ve noticed myself playing lately is to try and predict the types of objections that people will give to these posts, because I think once you sort of understand the ordinary paranoid / socially modest mindset, they become much easier to predict.

For example, if I didn’t write this already, I would predict a slight possibility that someone would object to your implication that requiring special characters in passwords is unnecessary, and that all you need is high entropy. I think these types of objections could even contain some p... (read more)

SquirrelInHell
Haha, it is also predictable that the very same people will read your comment and not get it. Salute

My understanding of A/B testing is that you don't need an explicit causal model, or a "big theory", in order to successfully use it; you would mostly be using intuitions gained from experience to test hypotheses like "users like the red page better than the blue page", which carry no explicit causal information.

Here you argue that intuitions gained from experience count as hypotheses just as much as causal theories do, and not only that, but that they tend to succeed more often than the big theories do. That depends on what... (read more)
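
For concreteness, the kind of model-free comparison meant here can be as simple as a two-proportion z-test; the visitor and conversion counts below are made up:

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical A/B test: does the red page convert better than the blue
# page? No causal model of *why*, just a significance test on the counts.
red_visitors, red_conversions = 10_000, 530
blue_visitors, blue_conversions = 10_000, 470

p_red = red_conversions / red_visitors
p_blue = blue_conversions / blue_visitors
p_pool = (red_conversions + blue_conversions) / (red_visitors + blue_visitors)

se = sqrt(p_pool * (1 - p_pool) * (1 / red_visitors + 1 / blue_visitors))
z = (p_red - p_blue) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided

print(f"red {p_red:.3f} vs blue {p_blue:.3f}: z = {z:.2f}, p = {p_value:.3f}")
```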

I wonder if it would have been as frustrating if he had instead opened with "The following are very loosely based on real conversations I've had, with many of the details changed or omitted." That's something many writers do and get away with, for the very reason that sometimes you want to show that someone actually thinks what you're claiming people think, but you don't actually want to be adversarial to the people involved. Maybe it's not the fairest to the specific arguments, but the alternative could quite possibly tu... (read more)

There’s a fundamental assumption your argument rests on which is a choice of prior: Assume that everyone’s credences in a given proposition is a distribution centered around the correct value an ideal agent would give to that proposition if they had access to all the information that was available and relevant to that proposition and had enough time and capacity to process that information to the fullest extent. Your arguments are sound given that the above is actually the correct prior, but I see most of your essay as arguing why modesty would be the corr... (read more)

I believe that equilibrium has already arrived...and it's no real surprise, since no preventive measures were ever put into place.

The reason this equilibrium occurs is that there is a social norm that says "upvote if this post is both easy to understand and contains at least one new insight." If a post contains lots of deep and valuable insights, this increases the likelihood that it is complex, dense, and hard to understand. Hard-to-understand posts often get mistaken for poor writing (or worse, will be put in a separate class and compared agai... (read more)

Screwtape
Counterargument, depending on what you view the goal of the rationality movement to be: If we want to raise the sanity waterline and get the benefits from having a lot of people armed with new insights, we need to be able to explain those insights to people.

Take literacy: there's a real benefit to getting a majority of a population fluent in that technique, something distinct from what you get from having a few people able to read. Imagine if the three rationality techniques most important to you were so widespread that it would be genuinely surprising if a random adult on the street wasn't capable with them. What would the last year have looked like if every adult knew that arguments are not soldiers, or that beliefs should pay rent? (My social media would be a much more pleasant place if everyone on it knew that nobody is perfect but everything is commensurable.)

We teach math by starting with counting, then addition, then subtraction, then multiplication, and so on until differential equations or multivariable calculus or wherever one's math education stops. One can argue that we teach math badly (and I would be pretty sympathetic to that argument) but I don't think "too many easy to understand lessons that teach only one new insight" is the problem.

I might go so far as to say we need multiple well written articles on the most important insights, written in a variety of styles to appeal to a wide variety of readers.
habryka
(This is one of the primary reasons why a post appearing in Featured is not decided by the number of upvotes or downvotes, but by moderator decision. We have a bunch of ability to push back against this incentive gradient.)

I think the post could also be interpreted as saying, "when you select for rare levels of super-competence in one trait, you are selecting against competence in most other traits" or at least, "when you select for strong charisma and leadership ability, you are selecting for below-average management ability." It's a little ambiguous how far this is likely to generalize, or just how strongly specific skills are expected to anti-correlate.

I think the central reason it’s possible for an individual to know something better than the solution currently prescribed by mainstream collective wisdom is that the vastness in the number of degrees of freedom in optimizing civilization guarantees that there will always be some potential solutions to problems that simply haven’t received any attention yet. The problem space is simply way, way too large to expect that even relatively easy solutions to certain problems are known yet.

While modesty may be appropriate in situations regarding a problem tha... (read more)

I think avoiding status games is sort of like trying to reach probabilities of zero or one: technically impossible, but you can get arbitrarily close, to the point where the weight that status shifts carry in everyone's decision making becomes almost non-measurable.
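
The analogy is exact in log-odds terms (a standard identity, nothing specific to this post):

```latex
\operatorname{logit}(p) = \log \frac{p}{1-p}, \qquad
\operatorname{logit}(p) \to -\infty \ \text{as} \ p \to 0^{+}, \qquad
\operatorname{logit}(p) \to +\infty \ \text{as} \ p \to 1^{-},
```

so any finite amount of effort or evidence only ever moves you a finite distance toward the extremes, even though you can keep getting closer.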

I'm also not sure I would define "not playing the game" as making sure that, within a group, everyone's relative status is the same. This is simply a different status game, just with different objectives. It seems to me that what you

... (read more)

I understand that there may be costs to you for continued interaction with the site, and that your primary motivations may have shifted, but I will say that your continued presence may act as a buffer that slows down the formation of an orthodoxy, and therefore you may be providing value by remaining even if the short term costs remain negative for a while.

Screwtape
Hrm. I would like it if Conor stuck around, since I think the content produced in the last 30 days was enjoyable and helpful to me, but I also think paying costs to slow down the formation of an LW orthodoxy that doesn't align with his goals would be a bad investment of energy. If it was costless or very low cost or if preventing the orthodoxy/causing it to form in a way that aligned with his goals was possible, then it would probably be worth it. I am not in Conor's head, but if I was in their place I wouldn't be convinced to stick around as just a delaying tactic. A much more convincing reason might be to stick around, take notes of who does engage with me the way I wanted to engage with people, and then continue to post here while mostly just paying attention to those people.

Disagreements here are largely going to revolve around how this observation and similar ones are interpreted. This kind of evidence must push us in some direction. We all agree that what we saw was surprising - a difficult task was solved by a system with no prior knowledge or task-specific information baked in. Surprise implies a model update. The question seems to be which model.

The debate referenced above is about the likelihood of AGI "FOOM". The Hansonian position seems to be that a FOOM is unlikely because obtaining generality acro

... (read more)
scarcegreengrass
The techniques you outline for incorporating narrow agents into more general systems have already been demoed, I'm pretty sure. A coordinator can apply multiple narrow algorithms to a task and select the most effective one, a la IBM Watson. And I've seen at least one paper that uses a RNN to cultivate a custom RNN with the appropriate parameters for a new situation.

The only things that are required, I believe, are that the full state of the game can be fed into the network as input, and that the action space is discrete and small enough to be represented by the network's output, which allows MCTS to be used. If you can transform an arbitrary problem into this formulation, then in theory the same methods can be used.
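
As a minimal sketch of that formulation (every detail invented for illustration; this is not DeepMind's code): the network maps the full state to a prior over a small discrete action space plus a value estimate, and the search scores actions with an AlphaZero-style PUCT rule:

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 4  # hypothetical small, discrete action space

def network(state):
    # Stand-in for the policy/value net; weights are drawn fresh here
    # purely to keep the sketch self-contained.
    logits = state @ rng.standard_normal((state.shape[0], n_actions))
    priors = np.exp(logits) / np.exp(logits).sum()
    return priors, float(np.tanh(state.mean()))

def puct_choice(priors, visit_counts, q_values, c_puct=1.5):
    # AlphaZero-style selection: exploit high Q values, explore actions
    # with high prior probability and few visits.
    total_visits = visit_counts.sum() + 1
    scores = q_values + c_puct * priors * np.sqrt(total_visits) / (1 + visit_counts)
    return int(np.argmax(scores))

state = rng.standard_normal(8)   # the "full state", fed in directly
priors, value = network(state)
print("chosen action:", puct_choice(priors, np.zeros(n_actions), np.zeros(n_actions)))
```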

I think I’m going to stake out a general disagreement position with this post, mainly because: 1) I mostly disagree with it (I am not simply being a devil’s advocate) and 2) I haven’t seen any rebuttals to it yet. Sorry if this response is too long, and I hope my tone does not sound confrontational.

When I first read Eliezer's post, it made a lot of sense to me and seemed to match points he's emphasized many times in the past. I would summarize the points I'm referring to as: There have been many situations throughout history whe

... (read more)

It's very likely that the majority of ethical discussion in AI will become politicized and therefore develop a narrow Overton window, which won't cover the actually important technical work that needs to be done.

The way that I see this happening currently is that ethics discussions have come to largely surround two issues: 1) Whether the AI system "works" at all, even in the mundane sense (could software bugs cause catastrophic outcomes?) and 2) Is it being used to do things we consider good?

The first one is largely just a question of

... (read more)
John_Maxwell
AI safety is already a pretty politicized topic. Unfortunately, the main dimension I see it politicized on is the degree to which it's a useful line of research in the first place. (I think it's possible that the way AI safety has historically been advocated for might have something to do with this.) Some have argued that "AI ethics" will help with this issue.

The interesting question to me is, are sociopaths ever useful to have or are they inherently destructive and malign? You used the analogy of an anglerfish, and I don't think there are many better comparisons you would want to make if your goal is to show something at the furthest (negative) edge of what we consider aesthetically pleasing - one of the most obviously hostile-looking creatures on the planet. To me that certainly seems intentional.

There are sort of three definitions of "sociopath" that get used here, and they often overlap, or

... (read more)
Benquo
The distinction between these types of "sociopath" is quite important, thanks for making it explicit.

Would you mind fleshing this out a bit more? I feel like when you say "overrate Ra" this could be meant in more than one sense - i.e., to overvalue social norms or institutions in general, or, in the specific sense of this discussion, to regard sociopaths as having more inherent worth to an institution or group than they truly have.

Benquo
Happy to try! I mean specifically those things, not social institutions generally. Ra and Sociopaths are both optimized for getting people to wave a flag that says X, but not necessarily for getting people to do X. Less confident about this, but I think a lot of the perceived value of Sociopaths is just that they're willing to give MOPs instructions, when Geeks are confused and trying to treat the MOPs like defective Geeks instead of their own thing. (I am totally guilty of this.)

The geeks, ideally, prefer to keep their beacon emitting on a very narrow, specific frequency band only – they’d prefer to keep all others out besides genuine geeks, and therefore their signal will need to be practically invisible to everyone else. This is kind of how I imagine the early proto-rationalist subcultures, you basically have to know exactly where to look to be able to find them, and already possess a large fraction of the required background knowledge.

It would have continued like that, if it wasn’t for the fact that they eventually needed peo

... (read more)
Benquo
I think you're underrating MOPs here and overrating sociopaths, for the same reason people overrate Ra.

To be clear, I do not believe that trying to create such a conspiracy is feasible, and wanted to emphasize that even if it were possible, you'd still need to have a bunch of other problems already solved (like making an ideal truth-seeking community). Sometimes it seems that rationalists want to have an organization that accomplishes the maximum utilitarian good, and hypothetically, this implies that some kind of conspiracy - if you wish to call it that - would need to exist. For a massively influential and secretive conspiracy, I might assign a <

... (read more)

Let’s suppose we solve the problem of building a truth-seeking community that knows and discovers lots of important things, especially the answers to deep philosophical questions. And more importantly, let's say the incentives of this group were correctly aligned with human values. It would be nice to have a permanent group of people that act as sort of a cognitive engine, dedicated to making sure that all of our efforts stayed on the right track and couldn’t be influenced by outside societal forces, public opinion, political pressure, etc. Like some

... (read more)
whpearson
Anything that is reliably influential seems like it would be attacked by individuals seeking influence. Maybe it needs to be surprisingly influential, like the surprising influence of the concents in Anathem (for those that haven't read it, there is a group of monks who are regularly shut off from the outside world and have little influence, but occasionally emerge into the real world and are super effective at getting stuff done).

I think EA might be able to avoid stagnation if there is a healthy crop of new organisations that spring up and it is not just dominated by behemoths. So perhaps expect organisations to be single-shot things: create lots of them and then rely on the community to differentially fund the organisations as we decide what is needed.
Chris_Leong
Interesting comment, but the way you've written it makes it sound like there is some kind of conspiracy which does not exist and which would fail anyway if it was attempted.

But a thing that strikes me here is this: the LW community really doesn't, so far as I can tell, have the reputation of being a place where ideas don't get frank enough criticism.

But the character of LW definitely has changed. In the early days when it was gaining its reputation, it seemed to be a place where you could often find lots of highly intense, vociferous debate over various philosophical topics. I feel that nowadays a lot of that character is gone. The community seems to have embraced a much softer and more inclusive discourse style, t

... (read more)

When you talk about “truthseekers”, do you mean someone who is interested in discovering as many truths as possible for themselves, or someone who seeks to add knowledge to the collective knowledge base?

If it’s the latter, then the rationalsphere might not be so easy to differentiate from say, academia, but if it’s the former, that actually seems to better match with what I typically observe about people’s goals within the rationalsphere.

But usually, when someone is motivated to seek truths for themselves, the “truth” isn’t really an end in and of itse

... (read more)