All of dogiv's Comments + Replies

dogiv70

And to elaborate a little bit (based on my own understanding, not what they told me) their RSP sort of says the opposite. To avoid a "race to the bottom" they base the decision to deploy a model on what harm it can cause, regardless of what models other companies have released. So if someone else releases a model with potentially dangerous capabilities, Anthropic can't/won't use that as cover to release something similar that they wouldn't have released otherwise. I'm not certain whether this is the best approach, but I do think it's coherent.

9Zach Stein-Perlman
Yep: Source
dogiv3411

I explicitly asked Anthropic whether they had a policy of not releasing models significantly beyond the state of the art. They said no, and that they believed Claude 3 was noticeably beyond the state of the art at the time of its release. 

dogiv130

The situation at Zaporizhzhia (currently) does not seem to be an impending disaster. The fire is/was in an administrative building. Fires at nuclear power plants can be serious, but the reactor buildings are concrete and would not easily catch fire due to nearby shelling or other external factors.

Some click-seekers on Twitter have made comparisons to Chernobyl. That kind of explosion cannot happen accidentally at Zaporizhzhia (it's a safer power plant design with sturdy containment structures surrounding the reactors). If the Russians wanted to cause a mas... (read more)

1Mary Chernyshenko
Thank you, this is great. I still have lots of misgivings, safety-wise, but I guess this is how it is for now.
dogiv20

Sounds like something GPT-3 would say...

dogiv60

Alternatively, aging (like most non-discrete phenotypes) may be omnigenic.

dogiv30

Thanks for posting this, it's an interesting idea.

I'm curious about your second-to-last paragraph: if our current evidence already favored SSA or SIA (for instance, if we knew that an event occurred in the past that had a small chance of creating a huge number of copies of each human, but we also know that we are not copies), wouldn't that already have been enough to update our credence in SSA or SIA? Or did you mean that there's some other category of possible observations, which is not obviously evidence one way or the other, but under this UDT framework we could still use it to make an update?

3cousin_it
Thank you! No, I didn't have any new kind of evidence in mind. Just a vague hope that we could use evidence to settle the question one way or the other, instead of saying it's arbitrary. Since many past events have affected the Earth's population, it seems like we should be able to find something. But I'm still very confused about this.
dogiv20

I'm curious who is the target audience for this scale...

People who have an interest in global risks will find it simplistic--normally I would think of the use of a color scale as aimed at the general public, but in this case it may be too simple even for the curious layman. The second picture you linked, on the other hand, seems like a much more useful way to categorize risks (two dimensions, severity vs urgency).

I think this scale may have some use in trying to communicate to policy makers who are unfamiliar with the landscape of GCRs, and in parti... (read more)

1avturchin
You can download the preprint here: https://philpapers.org/rec/TURGCA It has a section on who could use the scale: communication to the public, to policy-makers, and between researchers of different risks. We still don't have a global platform for communication about global catastrophic and existential risks, but I think that something like a "Global risk prevention" committee inside the UN will eventually be created, which will work on global coordination of risk prevention. The committee will use the scale and other instruments the same way other organisations use their 5-10 level scales, including DEFCON, the hurricane scale, the asteroid scale, VEI (the volcanic scale), etc.
dogiv40

Note also that non-alphanumeric symbols are hard to google. I kind of guessed it from context but couldn't confirm until I saw Kaj's comment.

dogiv60

Separately, and more importantly, the way links are currently displayed makes it hard to tell whether a link has already been visited. Also, if you select text, you can't see links anymore.

Firefox 57 on Windows 10.

5Raemon
Upvoted for including OS/browser info
dogiv10

I am encountering some kind of error when opening the links here to rationalsphere and single conversational locus. When I open them, a box pops up that says "Complete your profile" and asks me to enter my email address (even though I used my email to log in in the first place). When I type it in and press submit, I get the error: {"id":"app.mutation_not_allowed","value":"\"usersEdit\" on _id \"BSRa9LffXLw4FKvTY\""}

1habryka
This is a bug that sometimes happens when you've logged out of your account in one tab but are still logged in in another. This also sometimes happens when we push new versions, which currently sometimes logs users out.
dogiv50

I think this is an excellent approach to jargon and I appreciate the examples you've given. There is too much tendency, I think, for experts in a field to develop whatever terminology makes their lives easiest (or even in some cases makes them "sound smart") without worrying about accessibility to newcomers.

... but maybe ideally hints at a broader ecosystem of ideas

This sounds useful, but very hard to do in practice... do you know of a case where it's successful?

3Raemon
I'm not sure if there are great examples (part of the problem is that jargon is hard), but I think "epistemic vs instrumental rationality" are sort of in the right direction. They're not common-jargon (you'd only use them frequently if you were buying into the entire ecosystem of rationality-thinking), but they are relatively easy to explain, I don't think I've ever heard anyone misuse them, and they highlight that there's a lot more rationality worth learning.
dogiv00

Thanks for posting!

I haven't read your book yet but I find your work pretty interesting. I hope you won't mind a naive question... you've mentioned non-sunlight-dependent foods like mushrooms and leaf tea. Is it actually possible for a human to survive on foods like this? Has anybody self-experimented with it?

By my calculation, a person who needs 1800 kcals/day would have to eat about 5 kg of mushrooms. Tea (the normal kind, anyway) doesn't look any better.
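A rough version of that arithmetic (a sketch; the ~35 kcal per 100 g figure for raw mushrooms is an assumption and varies by variety):

```python
# Rough mass of mushrooms needed to hit a daily calorie target.
# Assumes about 35 kcal per 100 g of raw mushrooms (assumption; varies by type).
DAILY_KCAL = 1800
KCAL_PER_100G = 35

kg_needed = DAILY_KCAL / KCAL_PER_100G * 100 / 1000
print(f"{kg_needed:.1f} kg of mushrooms per day")  # roughly 5 kg
```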

Bacteria fed by natural gas seems like a very promising food source--and one that might even be viab... (read more)

3denkenberger
Here is an analysis of nutrition of a variety of alternate foods. Leaf protein concentrate is actually more promising than leaf tea. No one has tried a diet of only alternate foods - that would be a good experiment to run. With a variety, the weight is not too high. Yes, we are hoping that some of these ideas will be viable present day, because then we can get early investment.
dogiv00

You are assuming that all rational strategies are identical and deterministic. In fact, you seem to be using "rational" as a stand-in for "identical", which reduces this scenario to the twin PD. But imagine a world where everyone makes use of the type of superrationality you are positing here--basically, everyone assumes people are just like them. Then any one person who switches to a defection strategy would have a huge advantage. Defecting becomes the rational thing to do. Since everybody is rational, everybody switches to defecting--because this is just a standard one-shot PD. You can't get the benefits of knowing the opponent's source code unless you know the opponent's source code.
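A standard payoff matrix makes the lone-defector point concrete (a sketch; the payoff values are the usual textbook PD numbers, not taken from the post):

```python
# One-shot prisoner's dilemma payoffs for the row player (T > R > P > S).
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

# In a population that cooperates because "everyone assumes people are just
# like them", the opponent's move is effectively fixed at C, so a single
# defector gains (5 vs 3). Once everyone reasons this way, play collapses
# to (D, D), the standard one-shot outcome.
for my_move in ("C", "D"):
    print(my_move, "vs C ->", PAYOFF[(my_move, "C")])
```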

0DragonGod
In this case, I think the rational strategy is identical. If A and B are perfectly rational and have the same preferences, then even if they didn't both know the above two, they would converge on the same strategy. I believe that for any formal decision problem, a given level of information about that problem, and a given set of preferences, there is only one rational strategy (not a particular choice, but a strategy; the strategy may suggest a set of choices as opposed to any particular choice). I speculate that everyone knows that if a single one of them switched to defect, then all of them would, so I doubt it. However, I haven't analysed how RDT works in prisoner's dilemma games with n > 2, so I'm not sure.
dogiv00

The first section is more or less the standard solution to the open source prisoner's dilemma, and the same as what you would derive from a logical decision theory approach, though with different and less clear terminology than what is in the literature.

The second section, on application to human players, seems flawed to me (as does the claim that it applies to superintelligences who cannot see each other's source code). You claim the following conditions are necessary:

  1. A and B are rational

  2. A and B know each other's preferences

  3. They are each aware of 1

... (read more)
0DragonGod
I want to discuss the predisposition part. My argument for human players depends on this. If I was going to predispose myself, decide to choose an option, then which option would I predispose myself to? If the two players involved don't have mutual access to each other's source code, then how would they pick up on the predisposition? Well, if B is perfectly rational, and has these preferences, then B is for all intents and purposes equivalent to a version of me with these preferences. So I engage in a game with A. Now, because A also knows that I am rational and have these preferences, A* would simulate me simulating him. This leads to a self referential algorithm which does not compute. Thus, at least one of us must predispose ourselves. Predisposition to defection leads to (D, D), and predisposition to cooperation leads to (C, C). (C, C) > (D, D) thus the agents predispose themselves to cooperation. Remember that the agents update their choice based on how they predict the other agent would react to an intermediary decision step. Because they are equally rational, their decision making process is reflected. Thus A* is a high fidelity prediction of B, and B* is a high fidelity prediction of A. Please take a look at the diagrams.
dogiv20

I think many of us "rationalists" here would agree that rationality is a tool for assessing and manipulating reality. I would say much the same about morality. There's not really a dichotomy between morality being "grounded on evolved behavioral patterns" and having "a computational basis implemented somewhere in the brain and accessed through the conscious mind as an intuition". Rather, the moral intuitions we have are computed in our brains, and the form of that computation is determined both by the selection pressures of ev... (read more)

0Erfeyah
There is a difference. Computing a moral axiom is not the same as encoding it. With computation the moral value would be an intrinsic property of some kind of mathematical structure. An encoding on the other hand is an implementation of an environmental adaptation as behavior based on selection pressure. It does not contain an implicit rational justification but it is objective in the sense of it being adapted to an external reality.
dogiv160

I think this is an interesting and useful view, if applied judiciously. In particular, it will always tend to be most relevant for crony beliefs--beliefs that affect the belief-holder's life mainly through other people's opinions of them, like much of politics and some of religion. When it comes to close-up stuff that can cause benefit or harm directly, you will find that most people really do have a model of the world. When you ask someone whether so-and-so would make a good president, the answer is often a signal about their cultural affiliations. Ask th... (read more)

3Viliam
This. Although even there people sometimes develop an absence of model. But often they don't. I like this approach in general. Complaining about humans doing stupid things is like complaining about water being wet. But it is potentially useful to look at some obviously stupid behavior and ask: "Am I doing this too? Maybe on a smaller scale, or in a different area, but essentially the same mistake?"
0Bound_up
Quite right. They don't think those are beliefs, but those parts of their minds do work like a model. They don't consciously model, but all mammals model subconsciously, right? If I was going to clarify, I might say something like "This applies to abstract beliefs that they can't actually observe themselves being wrong about on a regular basis, like most of the ones they have about politics, religion, psychology, parenting strategies, etc."
dogiv20

This doesn't actually seem to match the description. They only talk about having used one laser, with two stakes, whereas your diagram requires using two lasers. Your setup would be quite difficult to achieve, since you would somehow have to get both lasers perfectly horizontal; I'm not sure a standard laser level would give you this kind of precision. In the version they describe, they level the laser by checking the height of the beam on a second stake. This seems relatively easy.

My guess is they just never did the experiment, or they lied about the result. But it would be kind of interesting to repeat it sometime.

0Fivehundred
Would Snell's Law possibly explain it? Someone claimed to me that it makes light refract more with decreasing altitude.
0Stabilizer
Thanks. You're right. I misinterpreted their experiment as written. I'll try to read it again to see what's going on and see if it's explicable.
dogiv20

Thanks, that's an interesting perspective. I think even high-level self-modification can be relatively safe with sufficient asymmetry in resources--simulated environments give a large advantage to the original, especially if the successor can be started with no memories of anything outside the simulation. Only an extreme difference in intelligence between the two would overcome that.

Of course, the problem of transmitting values to a successor without giving it any information about the world is a tricky one, since most of the values we care about are linked to reality. But maybe some values are basic enough to be grounded purely in math that applies to any circumstances.

0turchin
I also wrote a (draft) text, "Catching treacherous turn", where I attempted to create the best possible AI box and see the conditions under which it will fail. Obviously, we can't box a superintelligence, but we could box an AI of around human level and prevent its self-improvement through many independent mechanisms. One of them is cleaning its memory before each of its new tasks. In the first text I created a model of the self-improvement process, and in the second I explore how self-improvement could be prevented based on this model.
dogiv00

If visible precommitment by B requires it to share the source code for its successor AI, then it would also be giving up any hidden information it has. Essentially both sides have to be willing to share all information with each other, creating some sort of neutral arbitration about which side would have won and at what cost to the other. That basically means creating a merged superintelligence is necessary just to start the bargaining process, since they each have to prove to the other that the neutral arbiter will control all relevant resources to preven... (read more)

dogiv70

I've read a couple of Lou Keep's essays in this series and I find his writing style very off-putting. It seems like there's a deep idea about society and social-economic structures buried in there, but it's obscured by a hodgepodge of thesis-antithesis and vague self-reference.

As best I can tell, his point is that irrational beliefs like belief in magic (specifically, protection from bullets) can be useful for a community (by encouraging everyone to resist attackers together) even though it is not beneficial to the individual (since it doesn't prevent deat... (read more)

0casebash
Yeah, I have a lot of difficulty understanding Lou's essays as well. Nonetheless, there appear to be enough interesting ideas there that I will probably reread them again at some point. I suspect that attempting to write a summary as I go of the point that he is making might help clarify here.
7Viliam
This is a standard technique to appear deeper than one is. By never saying what exactly your idea was, no one can find a mistake there. If people agree with you, they will find an interpretation that makes sense for them. (If the interpretation is good, you can take credit. If the interpretation is wrong, you can blame the interpreter for the lack of nuance.) If people disagree with you, they cannot quote you, so you can accuse them of attacking a strawman. Or it simply buys you time and outsources research. You can play with the idea, observe what is popular and what is not, gradually converge on something, and pretend that this is what you meant since the beginning. (Note: there is nothing wrong with throwing a few random ideas at a wall and seeing what sticks, as long as you admit that this is what you are in fact doing.)
2[anonymous]
Yeah, that's a very good summary of what I think he's pointing to; much better than I could have done. As far as essays go, Lou's stuff seems rambly, but it's also novel enough to pique my interest, I guess? It's not as off the deep end as Ribbonfarm, but it's got enough novel (to me) ideas like that of the cultural traditions argument (I especially like how he points out that inferring past dominance of certain traits from their prevalence in the modern day could be faulty) that I enjoy it.
dogiv00

Are you talking about a local game in NY or a correspondence thing?

0Screwtape
I am not in New York actually! (I took a bus in to the solstice from out of state.) My first choice would be to play over some form of VoIP like Discord, leaning on Roll20 if imagery or dicerollers were a problem. I'm on Eastern Standard Time and work a nine to five, but have a fairly flexible schedule other than that. My second choice would be a play-by-post arrangement, which are easier to schedule but take longer to build up a sense of camaraderie. I think Chesscourt is the only rationalist forum I've heard of? ("Forum" here meaning "built for indefinite replies to a single thread" which may or may not be the technical definition of that word.) That said, I could pretty easily do both: a three hour Discord session with one group, and a forum thread on a reply-a-week basis elsewhere. That said, if there are three to five rationalists hanging out in the rural parts of VT who want to hang out in meatspace, you all should let me know =D
dogiv00

I like the first idea. But can we really guarantee that after changing its source code to give itself maximum utility, it will stop all other actions? If it has access to its own source code, what ensures that its utility is "maximum" when it can change the limit arbitrarily? And if all possible actions have the same expected utility, an optimizer could output any solution--"no action" would be the trivial one but it's not the only one.

An AI that has achieved all of its goals might still be dangerous, since it would presumably lose all ... (read more)

0turchin
The idea is not intended to be used as the primary means of AI control, but as a last-resort AI turn-off option. I describe it in a lengthy text, where all possible ways of AI boxing are explored, which I am currently writing under the name "Catching treacherous turn: confinement and circuit breaker system to prevent AI revolt, self-improving and escape". It will also work only if the reward function is presented not as plain text in the source code, but as a separate black box (created using cryptography or physical isolation). The stop code is, in fact, some solution to the complex cryptography used in this cryptographic reward function. I agree that running subagents may be a problem. We still don't have a theory of AI halting. It is probably better to use such a super-reward before many subagents have been created. Your last objection is more serious, as it shows that such a mechanism could turn a safe AI into a dangerous "addict".
dogiv20

It seems like the ideal leisure activities, then, should combine the social games with games against nature. Sports do this to some extent, but the "game against nature" part is mostly physical rather than intellectual.

Maybe we could improve on that. I'm envisioning some sort of combination of programming and lacrosse, where the field reconfigures itself according to the players' instructions with a 10-second delay...

But more realistically, certain sports are more strategic and intellectual than others. I've seen both tennis and fencing mentione... (read more)

2komponisto
Exactly! Hence arts (and sports).
2Lumifer
War. Strategic, intellectual, contains the team element, and is highly motivating :-P
dogiv20

AI is good at well-defined strategy games, but (so far) bad at understanding and integrating real-world constraints. I suspect that there are already significant efforts to use narrow AI to help humans with strategic planning, but that these remain secret. For an AGI to defeat that sort of human-computer combination would require considerably superhuman capabilities, which means without an intelligence explosion it would take a great deal of time and resources.

0turchin
If an AI is able to use humans as an outsourced form of intuition, as in Mechanical Turk, it may be able to play such games with much less intelligence of its own. Such a game may resemble Trump's election campaign, where cyberweapons, fake news, and internet memes were used by some algorithm. There was some speculation about it: https://scout.ai/story/the-rise-of-the-weaponized-ai-propaganda-machine We already see superhuman performance in war-simulating games, but nothing like it in AI self-improvement. Mildly superhuman capabilities may be reached without an intelligence explosion through the low-level accumulation of hardware, training, and knowledge.
dogiv70

More like driving to the store and driving into the brick wall of the store are adjacent in design space.

dogiv30

Yes, many people intuitively feel that a universe of pleasure and a universe of pain add to a net negative. But I suspect that's just a result of experiencing (and avoiding) lots of sources of extreme pain in our lives, while sources of pleasure tend to be diffuse and relatively rare. The human experience of pleasure is conjunctive because in order to survive and reproduce you must fairly reliably avoid all types of extreme pain. But in a pleasure-maximizing environment, removing pain will be a given.

It's also true that our brains tend to adapt to pleasure over time, but that seems simple to modify once physiological constraints are removed.

dogiv-10

Human disutility includes more than just pain too. Destruction of humanity (the flat plain you describe) carries a great deal of negative utility for me, even if I disappear without feeling any pain at all. There's more disutility if all life is destroyed, and more if the universe as a whole is destroyed... I don't think there's any fundamental asymmetry. Pain and pleasure are the most immediate ways of affecting value, and probably the ones that can be achieved most efficiently in computronium, so external states probably don't come into play much at all if you take a purely utilitarian view.

dogiv00

I'm not sure what you mean here by risk aversion. If it's not loss aversion, and it's not due to decreasing marginal value, what is left?

Would you rather have $5 than a 50% chance of getting $4 and a 50% chance of getting $7? That, to me, sounds like the kind of risk aversion you're describing, but I can't think of a reason to want that.

3Lumifer
Aversion to uncertainty :-) Let me give you an example. You are going to the theater to watch the first showing of a movie you really want to see. At the ticket booth you discover that you forgot your wallet and can't pay the ticket cost of $5. A bystander offers to help you, but because he's a professor of decision science he offers you a choice: a guaranteed $5, or a 50% chance of $4 and a 50% chance of $7. What do you pick?
dogiv00

You will not bet on just one side, you mean. You already said you'll take both bets because of the guaranteed win. But unless your credence is quite precisely 50%, you could increase your expected value over that status quo (guaranteed $1) by choosing NOT to take one of the bets. If you still take both, or if you now decide to take neither, it seems clear that loss aversion is the reason (unless the amounts are so large that decreasing marginal value has a significant effect).

0Lumifer
From my point of view it's not a bet -- there is no uncertainty involved -- I just get to collect $1. Not loss aversion -- risk aversion. And yes, in most situations most humans are risk averse. There are exceptions -- e.g. lotteries and gambling in general.
dogiv00

True, you're sure to make money if you take both bets. But if you think the probability is 51% on odd rather than 50%, you make a better expected value by only taking one side.

0Lumifer
The thing is, I'm perfectly willing to accept the answer "I don't know". How will I bet? I will not bet. There is a common idea that "I don't know" necessarily implies a particular (usually uniform) distribution over all the possible values. I don't think this is so.
dogiv00

Let's reverse this and see if it makes more sense. Say I give you a die that looks normal, but you have no evidence about whether it's fair. Then I offer you a two-sided bet: I'll bet $101 to your $100 that it comes up odd. I'll also offer $101 to your $100 that it comes up even. Assuming that transaction costs are small, you would take both bets, right?

If you had even a small reason to believe that the die was weighted towards even numbers, on the other hand, you would take one of those bets but not the other. So if you take both, you are exhibiting a probability estimate of exactly 50%, even though it is "uncertain" in the sense that it would not take much evidence to move that estimate.
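To put numbers on the expected-value comparison (a sketch using the stakes above, with p as your credence that the roll is odd):

```python
# Expected value (to you) of each offered bet, as a function of p = P(odd).
# "I'll bet $101 to your $100 that it comes up odd" means you win $101 on an
# even roll and lose $100 on an odd roll, and vice versa for the other offer.
def ev_you_on_even(p_odd):
    return (1 - p_odd) * 101 - p_odd * 100

def ev_you_on_odd(p_odd):
    return p_odd * 101 - (1 - p_odd) * 100

for p in (0.50, 0.51):
    both = ev_you_on_even(p) + ev_you_on_odd(p)            # always exactly $1
    best_single = max(ev_you_on_even(p), ev_you_on_odd(p))
    print(f"p(odd)={p}: both -> ${both:.2f}, best single -> ${best_single:.2f}")
# At p=0.50, both bets give a guaranteed $1 and either single bet is worth $0.50;
# at p=0.51, both bets still give $1, but the better single bet is worth $2.51.
```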

1Lumifer
Huh? If I take both bets, there is the certain outcome of me winning $1 and that involves no risk at all (well, other than the possibility that this die is not a die but a pun and the act of rolling it opens a transdimensional portal to the nether realm...)
dogiv10

Gasoline is an excellent example of this behavior. It consists of a mixture of many different non-polar hydrocarbons with varying densities, some of which would be gaseous outside of solution. It stays mixed indefinitely (assuming you don't let the volatile parts escape) because separation would require a reduction in entropy.

0MaryCh
Thank you! That's neat.
dogiv40

It seems like there's also an issue with risk aversion. In regular betting markets there are enough bets that you can win some and lose some, and the risks can average out. But if you bet substantially on x-risks, you will get only one low-probability payout. Even if you assume you'll actually get that one (relatively large) payout, the marginal value will be greatly decreased. To avoid that problem, people will only be willing to bet small amounts on x-risks. The people betting against them, though, would be willing to make a variety of large bets (each with low payoff) and thereby carry almost no risk.
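One way to see the decreasing-marginal-value point (a sketch assuming a logarithmic utility of wealth; the stakes and probabilities are made up, not taken from the comment):

```python
import math

# Sketch: under log utility of wealth, one large low-probability payout is
# worth less than its expected dollar value, while the same expected value
# spread over many independent small bets is nearly risk-free.
WEALTH = 100_000

def u(w):
    return math.log(w)

# A single bet on an x-risk claim: stake 10,000 for a 500,000 payout at 2%
# probability (numbers chosen so the expected dollar gain is exactly zero).
stake, payout, p = 10_000, 500_000, 0.02
eu_single = p * u(WEALTH - stake + payout) + (1 - p) * u(WEALTH - stake)
print(eu_single - u(WEALTH))  # about -0.07: unattractive despite zero EV

# Splitting the same exposure into 100 independent small bets keeps wealth
# near its expectation, so expected utility stays close to u(WEALTH).
```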

0Stuart_Armstrong
Yes. And making repeated small bets drives, in practice, the expected utility to the expected value of money, while one large bet doesn't.
dogiv00

I guess where we disagree is in our view of how a simulation would be imperfect. You're envisioning something much closer to a perfect simulation, where slightly incorrect boundary conditions would cause errors to propagate into the region that is perfectly simulated. I consider it more likely that if a simulation has any interference at all (such as rewinding to fix noticeable problems) it will be filled with approximations everywhere. In that case the boundary condition errors aren't so relevant. Whether we see an error would depend mainly on whether there are any (which, like I said, is equivalent to asking whether we are "in" a simulation) and whether we have any mechanism by which to detect them.

0denimalpaca
Everyone has different ideas of what a "perfectly" or "near perfectly" simulated universe would look like, I was trying to go off of Douglas's idea of it, where I think the boundary errors would have effect. I still don't see how rewinding would be interference; I imagine interference would be that some part of the "above ours" universe gets inside this one, say if you had some particle with quantum entanglement spanning across the universes (although it would really also just be in the "above ours" universe because it would have to be a superset of our universe, it's just also a particle that we can observe).
dogiv00

If it is the case that we are in a "perfect" simulation, I would consider that no different than being in a non-simulation. The concept of being "in a simulation" is useful only insofar as it predicts some future observation. Given the various multiverses that are likely to exist, any perfect simulation an agent might run is probably just duplicating a naturally-occurring mathematical object which, depending on your definitions, already "exists" in baseline reality.

The key question, then, is not whether some simulation of us ... (read more)

0denimalpaca
I 100% agree that a "perfect simulation" and a non-simulation are essentially the same, noting Lumifer's comment that our programmer(s) are gods by another name in the case of simulation. My comment is really about your second paragraph, how likely are we to see an imperfection? My reasoning about error propagation in an imperfect simulation would imply a fairly high probability of us seeing an error eventually. This is assuming that we are a near-perfect simulation of the universe "above" ours, with "perfect" simulation being done at small scales around conscious observers. So I'm not really sure if you just didn't understand what I'm getting at, because we seem to agree, and you just explained back to me what I was saying.
0Lumifer
One difference is the OFF switch. Another difference is the (presumed) existence of a debugger. Recall that the traditional name for the simulation hypothesis is "creationism".
dogiv30

Does anybody think this will actually help with existential risk? I suspect the goal of "keeping up" or preventing irrelevance after the onset of AGI is pretty much a lost cause. But maybe if it makes people smarter it will help us solve the control problem in time.

0The_Jaded_One
It has been fairly standard LW wisdom for a long time that any kind of human augmentation is unhelpful for friendliness. I think that we should be much less confident about this, and I welcome alternative efforts such as the neural lace.
0username2
Yes, I think the entire concept of the AI x-risk scary idea (e.g. Clippy) is predicated on machines being orders of magnitude smarter in some ways than their human builders. If instead there is a smooth transition to increasingly more powerful human augmented intelligence, then the transformative power of AI becomes evolutionary not revolutionary. Existing power structures continue to remain in effect as we move into a post human future. Of course there will be issues of access to augmentation technologies, bioethics panels, government regulation, etc. But these won't be existential risks.
0whpearson
It is part of a research program that I can see. Imagine that we understand the brain. We can replicate it in silicon and we can functionally decompose it into problem-solving and motivational sections. With a neural interface we could connect up a problem-solving bit with our motivational section, to give ourselves an external lobe (this could perhaps be done in a hacky indirect way without a direct connection). If this happens then there are two benefits to existential risk: 1) People will spend less money/time trying to create new agents 2) We will be closer to parity with new agents in problem-solving capability when they come about.
0scarcegreengrass
I also think this project will be on a fairly slow timeline. Maybe the AGI connections are functionally just marketing, and the real benefit of this org will be more mundane medical issues.
dogiv30

I just tried this out for a project I'm doing at work, and I'm finding it very useful--it forces me to think about possible failure modes explicitly and then come up with specific solutions for them, which I guess I normally avoid doing.

3[anonymous]
That's great! I'm glad it's been useful! (I actually set it as my homepage to prime myself to think about my goals each time I open my laptop.)
dogiv00

Encrypting/obscuring it does help a little bit, but doesn't eliminate the problem, so it's not just that.

dogiv20

I agree with that... personally I have tried several times to start a private journal, and every time I basically end up failing to write down any important thoughts because I am inhibited by the mental image of how someone else might interpret what I write--even though in fact no one will read it. Subconsciously it seems much more "defensible" to write nothing at all, and therefore effectively leave my thoughts unexamined, than to commit to having thought something that might be socially unacceptable.

0Lumifer
How do you know this? Note the difference between what you intend and what might happen to you and your property regardless of your intentions.
2[anonymous]
I agree w/ both above comments. This resonates and seems to provide an explanation that feels right. (There are thoughts I still won't journal or will only write in shorthand because they're so private.)
dogiv00

I've been trying to understand the differences between TDT, UDT, and FDT, but they are not clearly laid out in any one place. The blog post that went along with the FDT paper sheds a little bit of light on it--it says that FDT is a generalization of UDT intended to capture the shared aspects of several different versions of UDT while leaving out the philosophical assumptions that typically go along with it.

That post also describes the key difference between TDT and UDT by saying that TDT "makes the mistake of conditioning on observations" which ... (read more)

dogiv140

It does seem like a past tendency to overbuild things is the main cause. Why are the pyramids still standing five thousand years later? Because the only way they knew to build a giant building back then was to make it essentially a squat mound of solid stone. If you wanted to build a pyramid the same size today you could probably do it for 1/1000 of the cost but it would be hollow and it wouldn't last even 500 years.

Even when cars were new they couldn't be overbuilt the way buildings were in antiquity because they still had to be able to move themselves ... (read more)

2Tyrrell_McAllister
Which is interesting corroboration in light of CronoDAS's comment that cars have been getting more durable, not less.
dogiv00

Agreed. There are plenty of liberal views that reject certain scientific evidence for ideological reasons--I'll refrain from examples to avoid getting too political, but it's not a one-sided issue.

dogiv00

This may be partially what has happened with "science" but in reverse. Liberals used science to defend some of their policies, conservatives started attacking it, and now it has become an applause light for liberals--for example, the "March for Science" I keep hearing about on Facebook. I am concerned about this trend because the increasing politicization of science will likely result in both reduced quality of science (due to bias) and decreased public acceptance of even those scientific results that are not biased.

0username2
I agree with your concern, but I think that you shouldn't limit your fear to party-aligned attacks. For example, the Thirty-Meter Telescope in Hawaii was delayed by protests from a group of people who are most definitely "liberal" on the "liberal/conservative" spectrum (in fact, "ultra-liberal"). The effect of the protests is definitely significant. While it's debatable how close the TMT came to cancelation, the current plan is to grant no more land to astronomy atop Mauna Kea.
dogiv40

Interesting piece. It seems like coming up with a good human-checkable way to evaluate parsing is pretty fundamental to the problem. You may have noticed already, but Ozora is the only one that didn't figure out "easily" goes with "parse".

0Daniel_Burfoot
Good catch. Adverbial attachment is really hard, because there aren't a lot of rules about where adverbs can go. Actually, Ozora's parse has another small problem, which is that it interprets "complex" as an NN with a "typeadj" link, instead of as a JJ with an "adject" link. The typeadj link is used for noun-noun pairings such as "police officer", "housing crisis", or "oak tree". For words that can function as both NN and JJ (eg "complex"), it is quite hard to disambiguate the two patterns.
dogiv10

The idea that friendly superintelligence would be massively useful is implicit (and often explicit) in nearly every argument in favor of AI safety efforts, certainly including EY and Bostrom. But you seem to be making the much stronger claim that we should therefore altruistically expend effort to accelerate its development. I am not convinced.

Your argument rests on the proposition that current research on AI is so specific that its contribution toward human-level AI is very small, so small that the modest efforts of EAs (compared to all the massive corpor... (read more)

0MrMind
This is almost the inverse Basilisk argument.
dogiv00

I haven't seen any feminists addressing that particular argument (most are concerned with cultural issues rather than genetic ones) but my initial sense is something like this: a successful feminist society would have 1) education and birth control easily available to all women, and 2) a roughly equal division of the burden of child-rearing between men and women. These changes will remove most of the current incentives that seem likely to cause a lower birth rate among feminists than non-feminists. Of course, it could remain true that feminists tend to be ... (read more)

dogiv10

I would argue that the closest real-world analogue is computer hacking. It is a rare ability, but it can bestow a large amount of power on an individual who puts in enough effort and skill. Like magic, it requires almost no help from anyone else. The infrastructure has to be there, but since the infrastructure isn't designed to allow hacking, having the infrastructure doesn't make the ability available to everyone who can pay (like, say, airplanes). If you look at the more fantasy-style sci-fi, science is often treated like magic--one smart scientist can do all sorts of cool stuff on their own. But it's never plausible. With hacking, that romanticization isn't nearly as far from reality.

0higurashimerlin
or programming in general.
1MrMind
I feel that lock-picking has roughly the same features.
dogiv00

It seems like the key problem described here is that coalitions of rational people, when they form around scientific propositions, cause the group to become non-scientific out of desire to support the coalition. The example that springs to my mind is climate change, where there is social pressure for scientific-minded people (or even those who just approve of science) to back the rather specific policy of reducing greenhouse gas emissions rather than to probe other aspects of the problem or potential solutions and adaptations.

I wonder if we might solve pro... (read more)

dogiv20

Hi Jared,

Your question about vegetarianism is an interesting one, and I'll give a couple of responses because I'm not sure exactly what direction you're coming from.

I think there's a strong rationalist argument in favor of limiting consumption of meat, especially red meat, on both health and environmental grounds. These issues get more mixed when you look at moderate consumption of chicken or fish. Fish especially is the best available source of healthy fats, so leaving it out entirely is a big trade-off, and the environmental impact of fishing varies a g... (read more)

0Zarm
Thank you for the polite and formal response! I understand what you're saying about the chicken and fish. Pescetarian is much better than just eating all the red meat you can get your hands on. Now I understand what you're saying about the animal suffering, but I'd like to add some things. If you don't eat many chickens or many cows, then you can save more than one because you're consistently abstaining from meat consumption. It's also not about making the long-term effects on your own; it's contributing so that something like factory farming can be changed into something more sustainable, more environmentally friendly, and more responsive to animal concerns once more people boycott meat. Even if you were to choose to compare gray matter, you have to weigh the animal's death against the human's quite minor pleasure, which could have been just as pleasurable eating/doing something else. For you, does it really make life more difficult? From my personal experience and from hearing about others, the only hard part is the changing process. It's only difficult in certain situations because of society, and the point of boycotting is to change society so it's easier, along with the other benefits. Thanks again for responding!
dogiv90

The attempt to analytically model the recalcitrance of Bayesian inference is an interesting idea, but I'm afraid it leaves out some of the key points. Reasoning is not just repeated applications of Bayes' theorem. If it were, everyone would be equally smart except for processing speed and data availability. Rather, the key element is in coming up with good approximations for P(D|H) when data and memory are severely limited. This skill relies on much more than a fast processor, including things like simple but accurate models of the rest of the world, or kn... (read more)
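To illustrate the distinction (a minimal sketch with made-up numbers): the update itself is a one-liner; all of the difficulty lives in the likelihood terms fed into it.

```python
# Bayes' rule is mechanically trivial once the likelihoods are supplied.
def posterior(prior, p_d_given_h, p_d_given_not_h):
    evidence = prior * p_d_given_h + (1 - prior) * p_d_given_not_h
    return prior * p_d_given_h / evidence

print(posterior(prior=0.1, p_d_given_h=0.8, p_d_given_not_h=0.2))  # ~0.31
# The hard part is producing good approximations of p_d_given_h and
# p_d_given_not_h from a limited world model, which is where reasoning
# ability actually differs.
```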
