User Comment Replies

Jan Rzymkowski · 5y · 40

There is a huge leap between a procedure that lets a predictive model iteratively decrease its false positive rate and having an AGI.

2 · Past Account · 5y

[Deleted]

What specific dangers arise when asking GPT-N to write an Alignment Forum post?

Jan Rzymkowski · 5y · 10

Upon reflection, you're right that it won't be maximizing response per se.

But as we get deeper it's not so straightforward. GPT-3 models can be trained to minimize prediction loss (or, plainly speaking, to simply predict more accurately) on many different tasks, which are usually stated very simply (e.g. choose a word that would fill the blank).

But we end up with people taking models trained this way and using them to generate long texts based on some primer. And yes, in most cases such abuse of the model will end up with text that is simply co...

What specific dangers arise when asking GPT-N to write an Alignment Forum post?

Answer by Jan Rzymkowski · Jul 31, 2020 · 10

As far as I understand GPT-N, it's not very agent-like (it doesn't form a me-vs-environment abstraction and doesn't look for ways to transform its perceived environment to satisfy some utility function). I wouldn't expect it to "scheme" against people, since it lacks any concept of "affecting its environment".

However, it seems likely that GPT-N can perfect the skill of crowd-pleasing (we already see that; we're constantly amazed by it, despite how little meaning the generated texts carry). It can precisely modulate its tone and identify the talking points that get the m...

7 · TurnTrout · 5y

But GPT-3 is only trained to minimize prediction loss, not to maximize response. GPT-N may be able to crowd-please if it's trained on approval, but I don't think that's what's currently happening.

Being the (Pareto) Best in the World

Jan Rzymkowski · 6y · 20

This analysis seems to quietly assume that various important skills are independent variables, and therefore that many people at the top of their field will necessarily be average in various other skills (actually, the chart goes even further and assumes that there's a universal negative correlation between skills -- I'm not even sure if that's mathematically possible for more than 2 variables).

The world's greatest gerontologist will probably be very good at statistics, and even Ed Jaynes would probably be an above-average gerontologist just because he can effectively interpret gerontology data.
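
On whether universal negative correlation is possible for more than 2 variables: a short derivation suggests it is, but only up to a bound that shrinks as the number of skills grows. The sketch assumes n skills X_1, ..., X_n, each standardized to unit variance, that all share one common pairwise correlation ρ. Both the symbols and the equal-correlation simplification are assumptions made here for illustration and are not in the comment above.

```latex
\begin{align*}
0 \le \operatorname{Var}\Big(\sum_{i=1}^{n} X_i\Big)
  &= \sum_{i=1}^{n} \operatorname{Var}(X_i)
     + \sum_{i \ne j} \operatorname{Cov}(X_i, X_j) \\
  &= n + n(n-1)\,\rho \\
\Longrightarrow \quad \rho &\ge -\frac{1}{n-1}.
\end{align*}
```

So for n = 2 a correlation of -1 is allowed, for n = 3 the strongest common negative correlation is -1/2, and for 10 skills it is only -1/9. Under these assumptions a mild negative correlation between every pair of skills is possible, while a strong one is not.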

What kind of thing is logic in an ontological sense?

Jan Rzymkowski · 6y · 20

The applicability of logic to the physical world is a sort of theorem based on the laws of physics (mostly the more metaphysical and less technical ones, like the persistence of objects, which are themselves theorems of the basic laws of physics) and on the laws governing the process of formulating atomic statements from observations.

At the same time we need to be careful, as we can easily fall into the trap of unfalsifiability -- when the predictions of logic fail, we're used to saying that the problem was with our atomic statements.

That's just a sketch of the full explanation of the topic, which would require at least a chapter.

All of Jan Rzymkowski's Comments + Replies