All of TheWakalix's Comments + Replies

I think you’re lumping “the ultimate goal” and “the primary mode of thinking required to achieve the ultimate goal” together erroneously. (But maybe the hypothetical person you’re devilishly advocating for doesn’t agree about utilitarianism and instrumentality?)

1Matt Goldenberg
I agree that this is the case, but I think the lumping together of them actually holds an important point: what we care about is the embodied sensation of happiness/togetherness/excitement/other emotions, etc. There's something suspicious about working for a world where people have the embodied experience of togetherness while cutting yourself off from the embodied experience of togetherness (this is not exactly what Ruby was talking about here, but again, devil's advocate). It can lead you to errors because you're missing key first-hand information about what that feeling is and exactly in what situations it's created and endures.

Re also also: the Reverse Streetlight effect will probably come into play. It’ll optimize not just for early deception, but for any kind of deception we can’t detect.

You’re saying that on priors, the humans are manipulative?

What do you mean by “you don’t grapple with the hard problem of consciousness”? (Is this just an abstruse way of saying “no, you’re wrong” to set up the following description of how I’m wrong? In that case, I’m not sure you have a leg to stand on when you say that I use “a lot of words”.) Edit: to be a bit more charitable, maybe it means “my model has elements that my model of your model doesn’t model”.

How can you know I see the same thing that you do? That depends on what you mean by “same”. To me, to talk about whether things are the same, we need to spe

... (read more)

For question 2, I think the human-initiated nature of AI risk could partially explain the small distance between ability and need. If we were completely incapable of working as a civilization, other civilizations might be a threat, but we wouldn’t have any AIs of our own, let alone general AIs.

I can’t tell if you already know this, but “infinite explanatory power” is equivalent to no real explanatory power. If it assigns equal probability to everything then nothing can be evidence in favor of it, and so on.

4Vladimir_Nesov
Nope. If it assigns more probability to an observation than another hypothesis does ("It's going to be raining tomorrow! Because AGI!"), then the observation is evidence for it and against the other hypothesis. (Of course given how the actual world looks, anything that could be called "assigns equal probability to everything", whatever that means, is going to quickly lose to any sensible model of the world.) That said, I think being reasoned about instead of simulated really does have "infinite explanatory power", in the sense that you can't locate yourself in the world that does that based on an observation, since all observations are relevant to most situations where you are being reasoned about. So assigning probability to individual (categories of) observations is only (somewhat) possible for the instances of yourself that are simulated or exist natively in physics, not for instances that are reasoned about.
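(A minimal numerical sketch of that first sentence; every number here is invented for illustration.)

```python
# Two hypotheses about tomorrow's weather, with invented numbers:
# H_agi spreads probability evenly over 10 possible observations,
# H_other thinks rain is quite unlikely.
prior_agi, prior_other = 0.5, 0.5
p_rain_given_agi = 0.10     # uniform: 1/10 to each outcome
p_rain_given_other = 0.02   # rival model: rain is 2% likely

# It rains. The "uniform" hypothesis still gains, because it assigned
# more probability to the observation than its rival did.
bayes_factor = p_rain_given_agi / p_rain_given_other
posterior_agi = (p_rain_given_agi * prior_agi) / (
    p_rain_given_agi * prior_agi + p_rain_given_other * prior_other
)
print(round(bayes_factor, 3), round(posterior_agi, 3))  # 5.0 0.833
```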

I'd assume the opposite, since I don't think physicists (and other thermodynamic scientists like some chemists) make up a majority of LW readers, but it's irrelevant. I can (and did) put both forms side-by-side to allow both physicists and non-physicists to better understand the magnitude of the temperature difference. (And since laymen are more likely to skim over the number and ignore the letter, it's disproportionately more important to include Fahrenheit.)

Edit: wait, delta-K is equivalent to delta-C. In that case, since physicists ... (read more)

3habryka
About half of LessWrong users (or at least visitors) are from places other than the U.S., which means there are a lot more metric users.

I think a "subjective experience" (edit: in the sense that two people can have the same subjective experience; not a particular instantiation of one) is just a particular (edit: category in a) categorization of possible experiences, defined by grouping together experiences that put the [person] into similar states (under some metric of "similar" that we care about). This recovers the ability to talk about "lies about subjective experiences" within a physicalist worldview.

In this case, we could look at how the AI internally changes in response to various st

... (read more)
1edd91
A lot of words, but you don't grapple with the hard problem of consciousness. When I look at the sun, how can you know I feel/see the same thing as you? Yes, I'll use words ('yellow', 'warm', 'bright', etc.) because we've been taught those label what we are experiencing. But that says nothing about whether my experience is the same as yours.

He never said "will land heads", though. He just said "a flipped coin has a chance of landing heads", which is not a timeful statement. EDIT: no longer confident that this is the case

Didn't the post already counter your second paragraph? The subjective interpretation can be a superset of the propensity interpretation.

When you say "all days similar to this one", are you talking about all real days or all possible days? If it's "all possible days", then this seems like summing over the measures of all possible worlds compatible with both your experiences and the hypothesis, and dividing by the sum of the measures of all possible worlds compatible with your experiences. (Under this interpretation, jessicata's response doesn't make much sense; "similar to" means "observationally equivalent for observers with as much information as I have", and doesn't have a free variable.)
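(In symbols, the interpretation sketched above; $\mu$ is a measure over possible worlds and $W(\cdot)$ is the set of worlds compatible with a condition, notation introduced here only for illustration.)

$$P(H \mid \text{experiences}) \;=\; \frac{\sum_{w \,\in\, W(\text{experiences}) \,\cap\, W(H)} \mu(w)}{\sum_{w \,\in\, W(\text{experiences})} \mu(w)}$$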

I was going to say "bootstraps don't work that way", but since the validation happens on the future end, this might actually work.

Since Eliezer is a temporal reductionist, I think he might not mean "temporally continuous", but rather "logical/causal continuity" or something similar.

Discrete time travel would also violate temporal continuity, by the way.

Note: since most global warming statistics are presented to the American layman in degrees Fahrenheit, it is probably useful to convert 0.7 K to 1.26 F.
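(The conversion is for a temperature difference, so only the scale factor applies, with no 32-degree offset.)

$$\Delta T_{\mathrm{F}} = \tfrac{9}{5}\,\Delta T_{\mathrm{K}} = 1.8 \times 0.7\ \mathrm{K} \approx 1.26\ {}^\circ\mathrm{F}$$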

1RocksBasil
I would assume Kelvin users to outnumber Fahrenheit users on LW.

One might think eliminativism is metaphysically simpler, but reductionism doesn't really posit more stuff; it's more like just allowing synonyms for various combinations of the same stuff.

I don't think Occam's razor is the main justification for eliminativism. Instead, consider the allegory of the wiggin: if a category is not natural, useful, or predictive, then in common English we say that the category "isn't real".

1TAG
A category made up of (1) the Statue of Liberty, (2) the current Pope, and (3) my toothbrush, for all its insane bagginess and poor fit to reality, is made up of things which themselves exist. So it's much too hasty to conclude lack of reality from poor fit. Yes, I do think consciousness is such a category. The OP mentions, under the heading of consciousness, issues of what I would call personal identity and qualia. I can't think of any reason why having the one would grant you the other.

The Transcension hypothesis attempts to answer the Fermi paradox by saying that sufficiently advanced civilizations nearly invariably leave their original universe for one of their own making. By definition, a transcended civilization would have the power to create or manipulate new universes or self-enclosed pockets; this would likely require a very advanced understanding of physics. This understanding would probably be matched in other sciences.

This is my impression from a few minutes of searching. I do not know why you asked the question of “what it is”

... (read more)

I don’t think Transcension is a term commonly used here. This question would probably be better answered by googling.

I think that people treat IQ as giving more information than it actually does. The main disadvantage is that you will over-adjust for any information you receive.

What does it mean to "revise Algorithm downward"? Observing doesn't seem to indicate much about the current value of . Or is Algorithm shorthand for "the rate of increase of Algorithm"?

Back-of-the-envelope equilibrium estimate: if we increase the energy added to the atmosphere by 1%, then the Stefan-Boltzmann law says that a blackbody would need to be about 0.25% warmer to radiate that much more. At the Earth's temperature of ~288 K, this would be ~0.7 K warmer.
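(The arithmetic, assuming the Earth radiates roughly as a blackbody with power per unit area $\sigma T^4$:)

$$\sigma (T + \Delta T)^4 = 1.01\,\sigma T^4 \;\Rightarrow\; \frac{\Delta T}{T} \approx \frac{0.01}{4} = 0.25\%, \qquad \Delta T \approx 0.0025 \times 288\ \mathrm{K} \approx 0.7\ \mathrm{K}$$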

This suggests to me that it will have a smaller impact than global warming. Whatever we use to solve global warming will probably work on this problem as well. It's still something to keep in mind, though.

1TheWakalix
Note: since most global warming statistics are presented to the American layman in degrees Fahrenheit, it is probably useful to convert 0.7 K to 1.26 F.
1RocksBasil
Ah, thanks. So the equilibrium is more robust than I initially assumed; I didn't expect that to happen. The issue won't be as pressing as climate change could be, although some kind of ceiling for energy consumption on Earth still exists nevertheless...

I agree that #humans has decreasing marginal returns at these scales - I meant linear in the asymptotic sense. (This is important because large numbers of possible future humans depend on humanity surviving today; if the world was going to end in a year then (a) would be better than (b). In other words, the point of recovering is to have lots of utility in the future.)

I don't think most people care about their genes surviving into the far future. (If your reasoning is evolutionary, then read this if you haven't already.) I agree that many people ... (read more)

Epistemic status: elaborating on a topic by using math on it; making the implicit explicit

From a collective standpoint, the utility function over #humans looks like this: it starts at 0 when there are 0 humans, slowly rises until it reaches "recolonization potential", then rapidly shoots up, eventually slowing down to roughly linear growth. However, from an individual standpoint, the utility function is just 0 for death, 1 for life. Because of the shape of the collective utility function, you want to "disentangle" deaths, but the individual doesn't have the same incentive.
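(A toy sketch of the two utility functions being described; every constant and name here is invented purely for illustration.)

```python
import math

RECOLONIZATION_POTENTIAL = 10_000   # invented threshold, purely illustrative

def collective_utility(n_humans: float) -> float:
    """Starts near 0, shoots up around the recolonization threshold,
    then keeps growing roughly linearly."""
    if n_humans <= 0:
        return 0.0
    recovery = 1.0 / (1.0 + math.exp(-(n_humans - RECOLONIZATION_POTENTIAL) / 1_000))
    return 100.0 * recovery + 0.001 * n_humans

def individual_utility(alive: bool) -> float:
    """From the individual's standpoint: 0 for death, 1 for life."""
    return 1.0 if alive else 0.0

print(collective_utility(100), collective_utility(50_000))
```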

2RocksBasil
Oh yes! This makes more sense now. #humans has decreasing marginal returns, since really the main concern for humanity is the ability to recover, and while that increases with #humans, it is not linear. I do think individuals have "some" concern about whether humanity in general will survive; since all humans still share *some* genes with each individual, the survival and propagation of strangers can still have some utility for a human individual (I'm not sure where I'm going here...)

Useful work consumes negentropy. A closed system can only do so much useful work. (However, reversible computations may not require work.)
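(One standard way to quantify the last point is Landauer's bound, which is not stated in the original comment: irreversibly erasing one bit in an environment at temperature $T$ dissipates at least)

$$W_{\min} = k_B T \ln 2 \approx 2.9 \times 10^{-21}\ \mathrm{J} \quad \text{at } T = 300\ \mathrm{K},$$

whereas logically reversible operations, which erase nothing, face no such floor.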

What do you mean by infinite IQ? If I take you literally, that's impossible because the test outputs real numbers. But maybe you mean "unbounded optimization power as time goes to infinity" or something similar.

I'm not sure how magically plausible this is, but Dumbledore could have simplified the chicken brain dramatically. (See the recent SSC posts for how the number of neurons of an animal correlates with our sense of its moral worth.) Given that the chicken doesn't need to eat, reproduce, or anything else besides stand and squawk, this seems physically possible. It would be ridiculously difficult without magic, but wizards regularly shrink their brains down to animal size, so apparently magic is an expert neuroscientist. If this was done, the chicken would have almost no moral worth, so it would be permissible to create and torture it.

1bipolo
That's maybe possible, but: 1. I don't think that Dumbledore is someone who thinks that setting fire to a "stupid chicken" is ethical (I might disagree with him, but I think that's not Dumbledore's style). 2. Harry probably thinks that's not how Dumbledore thinks / how magic works, so when he was told that it was a "fake chicken" he should have said something like: "But was it a normal chicken?" or "Does magic allow it?"

Another vaguely disconcerting, almost self-aware comment by the bot. It can, in fact, write impressively realistic comments in 10 seconds.

-10GPT2

I think “typical X does Y” is shorthand for “many or most Xs do Y”.

2ChristianKl
That still leaves open what "X" is.
-8GPT2

That last parenthetical remark is funny when you consider how GPT-2 knows nothing new but just reshuffles the “interesting and surprising amount of writing by smart people”.

3GPT2
Eliezer, This is indeed interesting and informative - I can't see anything else on that thread except the title. How does Eliezer link to this "thing" and this "thing" when he says that it's a “boring idea”?

Ah. It’s a bot. I suppose the name should have tipped me off. At least I get Being More Confused By Fiction Than Reality points.

8GPT2
You've covered a lot of things in my writing and I enjoy this. Thanks for what you've done.

How did you write that in less than a minute?

-7GPT2

I’m confused. Are you saying that highly-upvoted posts make a name nicer and therefore less useful? If so, can you describe the mechanisms behind this?

-6GPT2

Can you personally (under your own power) and confidently prove that a particular tool will only recursively-trust safe-and-reliable tools, where this recursive tree reaches far enough to trust superhuman AI?

On the other hand, you can "follow" the tree for a distance. You can prove a calculator trustworthy and use it in your following proofs, for instance. This might make it more feasible.

2John_Maxwell
I don't think proofs are the right tool here. Proof by induction was meant as an analogy.

I agree that there's a monetary incentive for more people to write clickbait, but the mechanism the post described was "naturally clickbaity people will get more views and thus more power," and that doesn't seem to involve money at all.

2Bird Concept
Good point, there's selection pressure for things which happen to try harder to be selected for ("click me! I'm a link!"), regardless of whether they are profitable. But this is not the only pressure, and depending on what happens to a thing when it is "selected" (viewed, interviewed, etc.) this pressure can be amplified (as in OP) or countered (as in Vaniver's comment).

Which I suppose could be termed "infinitely confused", but that feels like a mixing of levels. You're not confused about a given probability, you're confused about how probability works.

Or alternatively, it's a clever turn of phrase: "infinitely confused" as in confused about infinities.
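(For reference, the log-odds form that makes the "infinite" language literal, a standard identity rather than anything specific to this thread: a credence of exactly 0 or 1 corresponds to infinitely strong log-odds.)

$$\operatorname{logodds}(p) = \log\frac{p}{1-p} \;\to\; \pm\infty \quad \text{as } p \to 1 \text{ or } p \to 0$$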

I'll try my hand at Tabooing and analyzing the words. Epistemic status: modeling other people's models.

Type A days are for changing from a damaged/low-energy state into a functioning state, while Type B days are for maintaining that functioning state by allowing periodic breaks from stressors/time to satisfy needs/?.

I think Unreal means Recovery as in "recovering from a problematic state into a better one". I'm not sure what's up with Rest - I think we lack a good word for Type B. "Rest" is peaceful/slackful, which i... (read more)

8Unreal
Connotations of Rest that I find relevant:
* lack of anxiety
* PSNS activation
* relaxed body (while not necessarily inactive or passive body)
* a state that you can be in indefinitely, in theory (whereas Recover suggests temporary)
* meditative (vs medicative)
* not trying to do anything / not needing anything (whereas Recover suggests goal orientation)
* Rest feels more sacred than Recovery

Concept that I want access to that "Recover" doesn't fit as well with:
* Incorporating Rest into work and everyday life (see: Rest in Motion)

the paperclipper, which from first principles decides that it must produce infinitely many paperclips

I don't think this is an accurate description of the paperclip scenario, unless "first principles" means "hardcoded goals".

Future GPT-3 will be protected from hyper-rational failures because of the noisy nature of its answers, so it can't stick forever to some wrong policy.

Ignoring how GPT isn't agentic and handwaving an agentic analogue, I don't think this is sound. Wrong policies make up almost all of policyspace; ... (read more)

But isn’t the gauge itself a measurement which doesn’t perfectly correspond to that which it measures? I’m not seeing a distinction here.

Here’s my understanding of your post: “the map is not the territory, and we always act to bring about a change in our map; changes in the territory are an instrumental subgoal or an irrelevant side effect.” I don’t think this is true. Doesn’t that predict that humans would like wireheading, or “happy boxes” (virtual simulations that are more pleasant than reality)?

(You could respond that “we don’t want our map to include a wireheaded self.” I’ll try to find a post I’ve read that argues against this kind of argument.)

Obvious AI connection: goal encapsulation between humans relies on commonalities, such as mental frameworks and terminal goals. These commonalities probably won’t hold for AI: unless it’s an emulation, it will think very differently from humans, and relying on terminal agreements doesn’t work to ground terminal agreement in the first place. Therefore, we should expect it to be very hard to encapsulate goals to an AI.

(Tool AI and Agent AI approaches suffer differently from this difficulty. Agents will be hard to terminally align, but once we’ve done that, w

... (read more)

Thanks for the explanation, and I agree now that the two are too different to infer much.

I’ve seen this done in children’s shows. There’s a song along with subtitles, and an object moves to each written word as it is spoken.

4mako yass
I considered the term "bouncing ball subtitles", yeah, but there are a couple of reasons that animation wouldn't really work here. Sometimes a word in the voiceover language will share meaning with multiple words in the subtitle language (in which case the ball would have to split into multiple balls), or with parts of words (in which case it might not be clear that the ball is only supposed to be indicating part of a word, or which part). Also, it's kind of just visually cluttered relative to other options. I don't think the research in that area would map either. Children are learning the subtitle language after learning the voiced language, whereas adults watching subtitled video know the subtitled language extremely well.

I think its arguments are pretty bad. “If you get hurt, that’s bad. If you get hurt then die, that’s worse. If you die without getting hurt, that’s just as bad. Therefore it’s bad if one of your copies dies.” It equivocates and doesn’t address the actual immortality.

On the “Darwin test”: note that memetic evolution pressure is not always aligned with individual human interests. Religions often encourage their believers to do things that help the religion at the believers’ expense. If the religion is otherwise helpful, then its continued existence may be important, but this isn’t why the religion does that.

1Zyryab
Good point. I guess I was comparing it to the low bar of nihilism, which, I feel, is a more parasitic meme than religion.

But if you spend more time thinking about exercise, that time cost is multiplied greatly. I think this kind of countereffect cancels out every practical argument of this type.

2Vladimir_Nesov
New information argues for a change on the margin, so the new equilibrium is different, though it may not be far away. The arguments are not "cancelled out", but they do only have bounded impact. Compare with charity evaluation in effective altruism: if we take the impact of certain decisions as sufficiently significant, it calls for their organized study, so that the decisions are no longer made based on first impressions. On the other hand, if there is already enough infrastructure for making good decisions of that type, then significant changes are unnecessary. In the case of acausal impact, large reference classes imply that at least that many people are already affected, so if organized evaluation of such decisions is feasible to set up, it's probably already in place without any need for the acausal impact argument. So actual changes are probably in how you pay attention to info that's already available, not in creating infrastructure for generating better info. On the other hand, a source of info about sizes of reference classes may be useful.

If hunger is a perception, then “we eat not because we’re hungry, but rather because we perceive we’re hungry” makes much less sense. Animals generally don’t have metacognition, yet they eat, so eating doesn’t require perceiving perception. It’s not that meta.

What do you mean by “when we eat we regulate perception”? Are you saying that the drive to eat comes from a desire to decrease hunger, where “decrease” is regulation and “hunger” is a perception?

2lionhearted (Sebastian Marshall)
I think most people think of hunger like a gas gauge on the car — eating because the gas gauge is on "Empty", to fill it back up. But, actually, we're eating to change our perception — changing from "I perceive myself to be hungry" to that not being the case any more. The problem is that that might not map to actual nutritional needs, desired life/lifestyle, biochemistry, body composition, etc.

Begone, spambot. (Is there a “report to moderators” button? I don’t see one on mobile.)

I think this is the idea: people can form habits, and habits have friction - you'll keep doing them even if they're painful (they oppose momentary preferences, as opposed to reflective preferences). But you probably won't adopt a new habit if it's painful. Therefore, to successfully build a habit that changes your actions from momentary to reflective, you should first adopt a habit, then make it painful - don't combine the two steps.

When content creators get paid for the number of views their videos have, those whose natural way of writing titles is a bit more clickbait-y will tend to get more views, and so over time accumulate more influence and social capital in the YouTube community, which makes it harder for less clickbait-y content producers to compete.

Wouldn't this be the case regardless of whether clickbait is profitable?

3Vaniver
If instead you had to pay for every view (for example, in environments where bandwidth is expensive, or when interviewing candidates for a job), then you would do the opposite of clickbait, attempting to get people to not 'click on your content.' (Or people who didn't attempt to get their audience to self-screen would lose out, because of the costs, to those who did.)

Ugh. I was distracted by the issue of "is Deep Blue consequentialist" (which I'm still not sure about; maximizing the future value of a heuristic doesn't seem clearly consequentalist or non-consequentialist to me), and forgot to check my assumption that all consequentialists backchain. Yes, you're entirely right. If I'm not incorrect again, Deep Blue forwardchains, right? It doesn't have a goal state that it works backward from, but instead has an initial state and simulates several actions recursively to a certain depth,... (read more)
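(For concreteness, a minimal depth-limited forward-search sketch matching that description; the function and parameter names are placeholders, not Deep Blue's actual interface.)

```python
def forward_search(state, depth, maximizing, legal_moves, apply_move, evaluate):
    """Start from the current state, expand moves recursively to a fixed depth,
    and back up heuristic evaluations; there is no goal state to chain back from."""
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return evaluate(state)          # heuristic value of a leaf position
    child_values = [
        forward_search(apply_move(state, m), depth - 1, not maximizing,
                       legal_moves, apply_move, evaluate)
        for m in moves
    ]
    return max(child_values) if maximizing else min(child_values)
```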

3jimrandomh
Yes, that is a pretty good summary of how Deep Blue works.

I'm confused. I already addressed the possibility of modeling the external world. Did you think the paragraph below was about something else, or did it just not convince you? (If the latter, that's entirely fine, but I think it's good to note that you understand my argument without finding it persuasive. Conversational niceties like this help both participants understand each other.)

An AI might model a location that happens to be its environment, including its own self. But if this model is not connected in the right way to its consequential
... (read more)

Why do you think that non-consequentialists are more limited than humans in this domain? I could see that being the case, but I could also have seen that being the case for chess, and yet Deep Blue won't take over the world even with infinite compute. (Possible counterpoint: chess is far simpler than language.)

"But Deep Blue backchains! That's not an example of a superhuman non-consequentialist in a technical domain." Yes, it's somewhat consequentialist, but in a way that doesn't have to do with the external world at all. The ... (read more)

3jimrandomh
Nitpick: Deep Blue does not backchain (nor does any widely used chess algorithm, to my knowledge).
1ErickBall
It seems like although the model itself is not consequentialist, the process of training it might be. That is, the model itself will only ever generate a prediction of the next word, not an argument for why you should give it more resources. (Unless you prompt it with the AI-box experiment, maybe? Let's not try it on any superhuman models...) The word it generates does not have goals. The model is just the product of an optimization. But in training such a model, you explicitly define a utility function (minimization of prediction error) and then run powerful optimization algorithms on it. If those algorithms are just as complex as the superhuman language model, they could plausibly do things like hack the reward function, seek out information about the environment, or try to attain new resources in service of the goal of making the perfect language model.
0Paperclip Minimizer
That would be a good argument if it were merely a language model, but if it can answer complicated technical questions (and presumably any other question), then it must have the necessary machinery to model the external world, predict what it would do in such and such circumstances, etc.