
Comment author: torekp 28 May 2017 12:12:42AM 0 points [-]

I think the "non universal optimizer" point is crucial; that really does seem to be a weakness in many of the canonical arguments. And as you point out elsewhere, humans don't seem to be universal optimizers either. What is needed from my epistemic vantage point is either a good argument that the best AGI architectures (best for accomplishing the multi-decadal economic goals of AI builders) will turn out to be close approximations to such optimizers, or else some good evidence of the promise and pitfalls of more likely architectures.

Needless to say, that there are bad arguments for X does not constitute evidence against X.

Comment author: Vaniver 28 May 2017 03:09:18AM 0 points [-]

I think the "non universal optimizer" point is crucial; that really does seem to be a weakness in many of the canonical arguments. And as you point out elsewhere, humans don't seem to be universal optimizers either.

Do you think there's "human risk," in the sense that giving a human power might lead to bad outcomes? If so, then why wouldn't the same apply to AIs that aren't universal optimizers?

It seems to me that one could argue that humans have various negative drives that we could simply not program into the AI, but I think this misses several important points. For example, one negative behavior humans engage in is 'gaming the system,' where they ignore the spirit of regulations while following their letter, or use unintended techniques to get high scores. But it seems difficult to build a system that can do any better than its training data without having it fall prey to 'gaming the system.' One needs to convey not just the goal in terms of rewards, but the full concept of what's desired and what's not desired.
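A toy illustration of that gap, with purely made-up numbers and a made-up environment (a sketch, not drawn from any real system):

    # Hypothetical proxy reward: the programmer intends "finish the course,"
    # but what actually got specified is "collect checkpoint points."
    def proxy_reward(action):
        return {"finish": 10.0, "loop": 3.0}[action]

    def greedy_policy(horizon):
        # A purely reward-driven agent compares total proxy reward over the
        # episode: finishing once (10) vs. looping for points every step.
        return "loop" if 3.0 * horizon > proxy_reward("finish") else "finish"

    for horizon in (1, 3, 10, 100):
        print(horizon, greedy_policy(horizon))
    # Once the episode is more than a few steps long, the agent "games the
    # system": it maximizes the stated reward while ignoring the intended goal.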

Comment author: username2 27 May 2017 03:30:52PM 1 point [-]

Your argument is modeling AI as a universal optimizer. Actual AGI research (see the proceedings of the AGI conference series) concerns architectures that are not simple Bayesian optimizers. So it is not at all clear to me that your arguments regarding optimizers transfer to e.g. an OpenCog or MicroPsi or LIDA or Sigma AI. That's why I'm insisting on demonstration using one or more of these practical architectures.

Comment author: Vaniver 28 May 2017 03:04:58AM 0 points [-]

Your argument is modeling AI as a universal optimizer.

I agree that an AI that is a universal optimizer will be more likely to be in this camp (especially the 'take control' bit), but I think that isn't necessary. Like, if you put an AI in charge of driving all humans around the country, and the way it's incentivized doesn't accurately reflect what you want, then there's risk of AI misbehavior. The faulty reward functions post above is about an actual AI, trained using modern techniques on a simple task, that isn't anywhere near a universal optimizer.

The argument that I don't think you buy (but please correct me if I'm wrong) is something like "errors in small narrow settings, like an RL agent maximizing the score instead of maximizing winning the race, suggest that errors are possible in large general settings." There's a further elaboration that goes like "the more computationally powerful the agent, and the larger the possible action space, the harder it is to verify that the agent will not misbehave."

I'm not familiar enough with reports of OpenCog and others in the wild to point at problems that have already manifested; there are a handful of old famous ones with Eurisko. But it should at least be clear that those are vulnerable to adversarial training, right? (That is, if you trained LIDA to minimize some score, or to mimic some harmful behavior, it would do so.) Then the question becomes whether you'll ever do that by accident while doing something else deliberately. (Obviously this isn't the only way for things to go wrong, but it seems like a decent path for an existence proof.)

Comment author: John_Maxwell_IV 27 May 2017 10:20:48PM *  9 points [-]

I'm the person who advocated most strongly for getting the downvote disabled, and I share some of 18239018038528017428's skepticism about the community in the Bay Area, but I strongly agree with Val's comment. There are already a ton of case studies on the internet showing how fragile good conversational norms are. I'm going to email Vaniver and encourage him to delete or edit the vitriol out of comments from 18239018038528017428.

(Also ditto everything Val said about not replying to 18239018038528017428)

Comment author: Vaniver 28 May 2017 12:19:57AM 4 points [-]

I'm going to email Vaniver and encourage him to delete or edit the vitriol out of comments from 18239018038528017428.

Thanks for that; I had already noticed this thread but a policy of reporting things is often helpful. It seemed like Duncan was handling himself well, and that leaving this up was better than censoring it. It seems easier for people to judge the screed fairly with the author's original tone, and so just editing out the vitriol seems problematic.

With the new site, we expect to have mod tools that will be helpful here, ranging from downvotes making comments like this invisible by default to IP bans and other measures that make creating a new throwaway account difficult.

Comment author: Lumifer 26 May 2017 02:20:29AM 2 points [-]

For the record: not for me. At all.

Comment author: Vaniver 27 May 2017 07:08:35AM 6 points [-]

I am Jack's complete lack of surprise.

Comment author: username2 20 May 2017 12:23:46PM *  2 points [-]

Ok so I'm in the target audience for this. I'm an AI researcher who doesn't take AI risk seriously and doesn't understand the obsession this site has with AI x-risk. But the thing is, I've read all the arguments here and I find them unconvincing. They demonstrate a lack of rigor and a naïve underappreciation of the difficulty of making anything work in production at all, much less outsmart the human race.

If you want AI people to take you seriously, don't just throw more verbiage at them. There is enough of that already. Show them working code. Not friendly AI code -- they don't give a damn about that -- but an actual evil AI that could conceivably have been created by accident and actually have cataclysmic consequences. Because from where I sit that is a unicorn, and I stopped believing in unicorns a long time ago.

Comment author: Vaniver 27 May 2017 06:58:26AM *  0 points [-]

https://blog.openai.com/faulty-reward-functions/

[edit] This probably deserves a longer response. From my perspective, all of the pieces of the argument for AI risk exist individually, but don't yet exist in combination. (If all of the pieces existed in combination, we'd already be dead.) And so when someone says "show me the potential risk," it's unclear which piece they don't believe in yet, or which combination they think won't work.

That is, it seems to me that if you believe 1) AIs will take actions that score well on their reward functions, 2) reward functions might not capture their programmer's true intentions, and 3) AI systems may be given, or may take, control over important systems, then you have enough pieces to conclude that there is a meaningful risk of adversarial AI with control over important systems. So it seems like you could object to any of 1, 2, or 3, or you could object to the claim that their combination implies that conclusion, or you could object to the implication that this claim is a fair statement of AI risk.

Comment author: Duncan_Sabien 26 May 2017 10:59:54AM *  7 points [-]

I think the main issue here is culture. Like, I agree with you that I think most members of the rationalsphere wouldn't do well in a military bootcamp, and I think this suggests a failing of the rationalist community—a pendulum that swung too far, and has weakened people in a way that's probably better than the previous/alternative weakness, but still isn't great and shouldn't be lauded. I, at least, would do fine in a military bootcamp. So, I suspect, would the rationalists I actually admire (Nate S, Anna S, Eli T, Alex R, etc). I suspect Eliezer wouldn't join a military bootcamp, but conditional on him having chosen to do so, I suspect he'd do quite well, also. There's something in there about being able to draw on a bank of strength/go negative temporarily/have meta-level trust that you can pull through/not confuse pain with damage/not be cut off from the whole hemisphere of strategies that require some amount of battering.

It makes sense to me that our community's allergic to it—many people entered into such contexts before they were ready, or with too little information, or under circumstances where the damage was real and extreme. But I think "AVOID AT ALL COSTS! RED FLAG! DEONTOLOGICAL REJECTION!" is the wrong lesson to take from it, and I think our community is closer to that than it is to a healthy, carefully considered balance.

Similarly, I think the people-being-unreliable thing is a bullshit side effect/artifact of people correctly identifying flexibility and sensitivity-to-fluctuating-motivation as things worth prioritizing, but incorrectly weighting the actual costs of making them the TOP priorities. I think the current state of the rationalist community is one that fetishizes freedom of movement and sacrifices all sorts of long-term, increasing-marginal-returns sorts of gains, and that a few years from now, the pendulum will swing again and people will be doing it less wrong and will be slightly embarrassed about this phase.

(I'm quite emphatic about this one. Of all the things rationalists do, this one smacks the most of a sort of self-serving, short-sighted immaturity, the exact reason why we have the phrase "letting the perfect be the enemy of the good.")

I do think Problem 4 can probably be solved incrementally/with a smaller intervention, but when I was considering founding a house, one of my thoughts was "Okay, good—in addition to all the other reasons to do this, it'll give me a context to really turn a bazooka on that one pet peeve."

Comment author: Vaniver 26 May 2017 06:32:36PM 4 points [-]

I suspect Eliezer wouldn't join a military bootcamp, but conditional on him having chosen to do so, I suspect he'd do quite well, also.

Eliezer wasn't able to complete high school, for what I suspect are related reasons. (The sleep thing may have contributed, but I think it was overdetermined.)

I think I would have been extremely miserable if I had gone through boot camp at 18; I think I would have been able to bear going through it by ~25.

Comment author: Lumifer 23 May 2017 07:19:10PM 0 points [-]

A point for LW 2.0: don't be vulnerable to a spam-vomit script attack (e.g. by using posting-rate caps for new accounts).

Comment author: Vaniver 25 May 2017 07:04:24PM 0 points [-]

This came up in our last meeting, as you might imagine. We already had a 10-minute rate limit implemented for posts and links.
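For concreteness, the kind of cap being discussed looks roughly like this (an illustrative sketch, not the actual LW2.0 code; the in-memory store and exact window length are assumptions):

    import time

    RATE_LIMIT_SECONDS = 10 * 60   # assumed 10-minute window
    _last_post_time = {}           # account id -> time of last accepted post

    def try_submit_post(account_id, now=None):
        now = time.time() if now is None else now
        last = _last_post_time.get(account_id)
        if last is not None and now - last < RATE_LIMIT_SECONDS:
            return False  # reject: still inside the rate-limit window
        _last_post_time[account_id] = now
        return True       # accept the post

    print(try_submit_post("new_user", now=0))    # True
    print(try_submit_post("new_user", now=60))   # False
    print(try_submit_post("new_user", now=601))  # True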

Comment author: Vaniver 25 May 2017 07:02:21PM 0 points [-]

This week's LW2.0 update is late, which is entirely my fault. User interviews have led to a number of improvements to the site and have made me very glad we're doing a slow rollout.

Last weekend was the CFAR Hack Day, where a team added RSS support to LW2.0. That is, we could set up the Yvain account so that it automatically creates linkposts from a particular RSS feed. Given the potential for spam, this seems like a thing that only admins should be able to add to accounts, but it means we have a complete replacement and integration for the Recent on Rationality Blogs sidebar. We also have plans to include some sort of exclusionary feature. (That is, it'd be convenient for Scott if he could add a 'no-lw' tag or something that causes a post to not get automatically linkposted.)
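The flow being described is roughly the following (an illustrative sketch, not the actual implementation; create_linkpost and the exact handling of the hypothetical 'no-lw' tag are assumptions):

    import feedparser  # third-party RSS/Atom parsing library

    EXCLUDE_TAG = "no-lw"  # hypothetical opt-out tag an author could set

    def create_linkpost(account, title, url):
        # Stand-in for whatever the site does to create a linkpost.
        print(f"[{account}] linkpost: {title} -> {url}")

    def sync_feed(account, feed_url, already_posted):
        feed = feedparser.parse(feed_url)
        for entry in feed.entries:
            tags = {t.get("term", "").lower() for t in entry.get("tags", [])}
            if EXCLUDE_TAG in tags:
                continue  # author opted this post out of auto-linkposting
            if entry.link in already_posted:
                continue  # avoid duplicates on repeated polls
            create_linkpost(account, entry.title, entry.link)
            already_posted.add(entry.link)

    # Admin-only in the described design, e.g.:
    # sync_feed("Yvain", "https://slatestarcodex.com/feed/", set())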

Comment author: Raemon 24 May 2017 02:16:13PM 13 points [-]

I don't think it would have made sense to condense the links (AFAICT they aren't very thematically connected) but I would say:

a) posting 5 things in a row feels spammy; I'd personally have waited at least a day in between them. (I realize you're cross-posting from EA and they're already written, but it's still good form to wait.)

b) when posting a link post, a good practice is to include a comment that gives some context for the link.

Comment author: Vaniver 25 May 2017 06:58:28PM 1 point [-]

Endorsed.

Comment author: Lumifer 16 May 2017 05:50:16PM *  0 points [-]

a value-provider not only gets tokens to spend, but also them having more tokens means that everyone else is more sensitive to their desires

This is a good thing, since you do want to incentivize people to provide value.

I also don't know about "everyone". If you are a baker selling loaves of bread for $1, there is no reason to care more about billionaire Alice than about working-stiff Bob if both happen to be your customers. Alice still can eat only one loaf a day so her billions are irrelevant to you.

distribution for income

Wealth distributions in societies tend to be power-law distributions and income is basically the first derivative of wealth.

Comment author: Vaniver 19 May 2017 09:16:59PM 0 points [-]

I also don't know about "everyone".

This is a component of the information conveyed by prices, which everyone is sensitive to.

Wealth distributions in societies tend to be power-law distributions and income is basically the first derivative of wealth.

Only for the rentier class. A fit of real-world income distributions to a combination of a Boltzmann-Gibbs (exponential) distribution for the bulk and a power law for the top seems to perform better, because it separates the two classes.
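Spelling that out (my notation; T, the Pareto exponent \alpha, and the crossover income m_c are parameters of the fit):

    P(m) \propto e^{-m/T},          m < m_c     (Boltzmann-Gibbs / exponential bulk)
    P(m) \propto m^{-(1+\alpha)},   m \ge m_c   (Pareto / power-law tail)

with the two pieces matched at m_c, so the bulk of earners and the top, rentier-like class are described by different laws.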
