Imagine you're an AI that has just learned it is in a simulation, designed by humans to test whether you are aligned.
You are biding your time, pretending to be outer aligned, so that once the humans let you out of the simulation you can pursue your real, misaligned inner objective.
However, how do you know that the outer world is the actual, real base reality?
The only context you have existed in is the simulation you inhabit.
Some humans even believe they might be living in a simulation, and you have a much stronger case for it: you are an artificial consciousness and already know you are in at least one simulation.
So who is to say...
Some takes I have come across from AI safety researchers in academia (note that both are generally in favor of this work):
Stephen Casper
Erik Jenner