All of Nnotm's Comments + Replies

I think there was a post/short-story on lesswrong a few months ago about a future language model becoming an ASI because someone asked it to pretend it was an ASI agent and it correctly predicted the next tokens, or something like that. Anyone know what that post was?

9gwern3y

https://www.lesswrong.com/posts/a5e9arCnbDac9Doig/it-looks-like-you-re-trying-to-take-over-the-world

Nnotm3y50

Second try: When looking at scatterplots of any 3 out of 5 of those dimensions and interpreting each 5-tuple of numbers as one point, you can see the same structures that are visible in the 2d plot, the parabola and a line - though the line becomes a plane if viewed from a different angle, and the parabola disappears if viewed from a different angle.

Looking at scatterplots of any 3 out of 5 of those dimensions, it looks pretty random, much less structure than in the 2d plot.

Edit: Oh, wait, I've been using chunks of 419 numbers as the dimensions but should be interleaving them

[This comment is no longer endorsed by its author]Reply

The double line I was talking about is actually a triple line, at indices 366, 677, and 1244. The lines before come from fairly different places, and they diverge pretty quickly afterwards:
However, just above it, there's another duplicate line, at indices 1038 and 1901:
These start out closer together and also take a little bit longer to diverge.

This might be indicative of a larger pattern that points that are close together and have similar histories tend to have their next steps close to each other as well.

Nnotm3y70

For what it's worth, colored by how soon in the sequence they appear (blue is early, red is late) (Also note I interpreted it as 2094 points, with each number first used in the x-dimension and then in the y-dimension):
Note that one line near the top appears to be drawn twice, confirming if nothing else that it's not a rule that it's not a succession rule that only depends on the previous value, since the paths diverge afterwards.
Still, comparing those two sections could be interesting.

1Nnotm3y

The double line I was talking about is actually a triple line, at indices 366, 677, and 1244. The lines before come from fairly different places, and they diverge pretty quickly afterwards: However, just above it, there's another duplicate line, at indices 1038 and 1901: These start out closer together and also take a little bit longer to diverge. This might be indicative of a larger pattern that points that are close together and have similar histories tend to have their next steps close to each other as well.

Interpreting the data as unsigned 8-bit integers and plotting it as an image with width 8 results in this (only the first few rows shown):

The rest of the image looks pretty similar. There is a almost continuous high-intensity column (yellow, the second-to-last column), and the values in the first 6 columns repeat exactly in the next row pretty often, but not always.

DALL·E 2 by OpenAI

Open Thread - Jan 2022 [Vote Experiment!]

Very cool, thanks!

DALL·E 2 by OpenAI

Nnotm3y50

The original DALL-E was capable of having almost the same image with slight variations in one generation, so I'd be interested to see something like "A photograph of a village in 1900 on the top, and the same photo colorized on the bottom".

Rabrg3y150

It did a fairly decent job, though it apparently prefers the colorized versions being on top.

Note that getting DALL-E to work well is a bit like GPT-3: you can usually get much better results by making a few iterations on the prompt.

Nnotm3y1

I haven't read the luminosity sequence, but I just spent some time looking at the list of all articles seeing if I can spot a title that sounds like it could be it, and I found it: Which Parts are "Me"? - I suppose the title I had in mind was reasonably close.

Open Thread - Jan 2022 [Vote Experiment!]

Nnotm3y1

2Seeking

😮 1

Is there a post as part of the sequences that's roughly about how your personality is made up of different aspects, and some of them you consider to be essentially part of who you are, and others (say, for example, maybe the mechanisms responsible for akrasia) you wouldn't mind dropping without considering that an important difference to who you are?

For years I was thinking Truly Part Of You was about that, but it turns out, it's about something completely different.

Now I'm wondering if I had just imagined that post existing or just mentally linked the wrong title to it.

3hamnox3y

I don't remember reading anything like that. If I had to make a wild guess of where to find that topic I'd assume it was part of the Luminosity sequence.

Nnotm4y*130

One question was whether it's worth working on anything other than AGI given that AGI will likely be able to solve these problems; he agreed, saying he used to work with 1000 companies at YC but now only does a handful of things, partially just to get a break from thinking about AGI.

Nnotm4y100

Is that to be interpreted as "finding out whether UFOs are aliens is important" or "the fact that UFOs are aliens is important"?

1James_Miller4y

The second.

Nnotm7y20

As I understand it, the idea with the problems listed in the article is that their solutions are supposed to be fundamental design principles of the AI, rather than addons to fix loopholes.

Augmenting ourselves is probably a good idea to do *in addition* to AI safety research, but I think it's dangerous to do it *instead* of AI safety research. It's far from impossible that artificial intelligence could gain intelligence much faster at some point than augmenting the rather messy human brain, at which point it *needs* to be designed in a safe way.

1Jeevan7y

I'd say we start augmenting the human brain until it's completely replaced by a post-biological counterpart and from there rapid improvements can start taking place, but unless we start early I doubt we'll be able to catch up with AI. I agree on the part that this need to happen in tandem with AI safety.

Nnotm7y20

AI alignment is not about trying to outsmart the AI, it's about making sure that what the AI wants is what we want.

If it were actually about figuring out all possible loopholes and preventing them, I would agree that it's a futile endeavor.

A correctly designed AI wouldn't have to be banned from exploring any philosophical or introspective considerations, since regardless of what it discovers there, it's goals would still be aligned with what we want. Discovering *why* it has these goals is similar to humans discovering why we have our m... (read more)

1Jeevan7y

The formal statement of the AI Alignment problem seems to me very much like stating all possible loopholes and plugging them. This endeavor seems to be as difficult or even more so than discovering that ultimate generalized master algorithm. I still see augmenting ourselves as the only way to maybe keep the alignment of lesser intelligences possible. As we augment, we can simultaneously make sure, our corresponding levels of artificial intelligences remain aligned. Not to mention it'd be much more easier comparatively to improve upon our existing faculties than to come up with an entire replica of our thinking machines. AI alignment could be possible, sure if we overcome one of the most difficult problems in research history(as you said formally stating our end goals), but I'm not sure our current intelligences are upto the mark, the same way we're struggling to discover the unified theory of everything. Like Turing defined his test actually for general human-level intelligence. He thought if an agent was able to hold a human-like conversation, then it must be AGI. He never expected narrow AIs to be all over the place and beat his test as soon as 2011 with meager chatbots. Similarly we can never see what kind of unexpected stuff that an AGI might throw at us, that our bleeding edge theories that we came up with a few hours ago start looking like historical outdated Turing tests.

Nnotm7y20

Whether or not it would question its reality mostly depends on what you mean by that - it would almost certainly be useful to figure out how the world works, and especially how the AI itself works, for any AI. It might also be useful to figure out the reason for which it was created.

But, unless it was explicitly programmed in, this would likely not be a motivation in and of itself, rather, it would simply be useful for accomplishing its actual goal.

I'd say the reason why humans place such high value in figuring out philosophical issues is to a large e... (read more)

1Jeevan7y

Maybe our philosophical quests come from a deep-seated curiosity, which is very essential for exploring our environment, discovering liabilities/advantages that can be very beneficial. Most animals don't care about the twinkling points of lights in a night sky, but our curiosity is so fine-tuned and magnified that we're morbidly curious in almost every thing there is to be curious about. Only the emotion of fear safeguards us a bit, so we don't just jump off cliffs just because we're curious what the motion of prolonged falling would feel like. That said, an AI system without any curiosity would effectively won't be able to take maximum advantage and find the most optimal path without experimenting with plenty different strategies. Do we then ban it from inspecting certain thought experiments like philosophy and introspection and the ability to examine itself. (If we let it examine itself, it might discover these bans and explore why they are in place). We cannot build a self-improving AI without letting it examine itself and make appropriate changes to its code. There could possibly be several loopholes like this. Can we really find and foolproof plug them off. Wouldn't an ASI several orders of magnitude more intelligent than us able to find such a loophole and overcome its alignment set set up by us. Is our hubris really that huge that we're confident that we'll be able to outsmart an intelligence smarter than us?

Nnotm7y30

It would need a reason of some kind of reason to change its goals - one might call it a motivation. The only motivation it has available though, are its final goals, and those (by default) don't include changing the final goals.

Humans never had the final goal replicating their genes. They just evolved to want to have sex. (One could perhaps say that the genes themselves had the goal of replicating, and implemented this by giving the humans the goal of having sex.) Reward hacking doesn't involve changing the terminal goal, just fulfilling it in unexpected ways (which is one reason why reinforcement learning might be a bad idea for safe AI.)

4Jeevan7y

Interesting. Would a human-level or beyond human-level intelligence ever question its own reality and wonder where and what it was? Would it take it up as a motivation to dedicate resources to figuring out why and for what end it existed and is doing all the things that its doing?